A Novel K-means-based Feature Reduction
The aim of feature reduction is to reduce the size of a data file, eliminate irrelevant features, and discover the features that are effective for data analysis. Irrelevant features can skew analyses such as data clustering; therefore, feature reduction must preserve the data structure, i.e., the data clusters. In this article, motivated by the success of k-means-based clustering methods, a feature reduction method based on weighted k-means (wk-means) is presented. More specifically, the data features are first weighted using the wk-means method. A feature with a high weight is not necessarily better for clustering than a feature with a low weight; the weight only rescales the feature's range to improve clustering. Then, using a novel mathematical model, the group of weighted features with the least effect on the data clusters is eliminated and the remaining features are selected. In contrast to sparse k-means, the number of selected features in the proposed method can be set explicitly by the user. Experimental results on four real datasets show that, after feature reduction by the proposed method, the clusters obtained by wk-means are more accurate than those produced with sparse k-means, PCA, and LLE.
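To make the two-stage idea concrete, the following is a minimal sketch, not the paper's exact model: it implements a standard weighted k-means in the spirit of W-k-means (per-feature weights derived from within-cluster dispersion), and then simply keeps the `m` highest-weighted features. The paper's actual elimination step uses a mathematical model the abstract does not detail, so the top-`m` selection here is an assumed simplification; all function and parameter names are illustrative.

```python
import numpy as np

def wkmeans(X, k, beta=2.0, n_iter=20, seed=0):
    """Toy weighted k-means: alternates cluster assignment, center
    updates, and feature-weight updates (beta > 1 controls how sharply
    weights concentrate on low-dispersion features)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Farthest-point initialization: pick one random point, then the
    # points with the largest minimum distance to the chosen centers.
    centers = X[[rng.integers(n)]].astype(float)
    while len(centers) < k:
        dmin = ((X[:, None, :] - centers[None]) ** 2).sum(2).min(1)
        centers = np.vstack([centers, X[dmin.argmax()]])
    w = np.full(d, 1.0 / d)  # feature weights, kept on the simplex
    for _ in range(n_iter):
        # Assign each point via weighted squared Euclidean distance.
        dist = (((X[:, None, :] - centers[None]) ** 2) * w**beta).sum(2)
        labels = dist.argmin(1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
        # Per-feature within-cluster dispersion; features that vary
        # little inside clusters receive larger weights.
        D = ((X - centers[labels]) ** 2).sum(0)
        D = np.maximum(D, 1e-12)
        w = (1.0 / D) ** (1.0 / (beta - 1.0))
        w /= w.sum()
    return w, labels

def select_features(X, w, m):
    """Keep the m features with the largest wk-means weights (an assumed
    proxy for the paper's elimination model)."""
    keep = np.sort(np.argsort(w)[::-1][:m])
    return X[:, keep], keep

# Demo: 2 informative features (two well-separated clusters) plus
# 4 uniform-noise features; the informative pair should be selected.
rng = np.random.default_rng(1)
informative = np.vstack([
    rng.normal(0.0, 0.1, (50, 2)),
    rng.normal(10.0, 0.1, (50, 2)),
])
noise = rng.uniform(0, 1, (100, 4))
X = np.hstack([informative, noise])
w, labels = wkmeans(X, k=2)
X_reduced, selected_idx = select_features(X, w, m=2)
```

In this synthetic setting the informative features have much lower within-cluster dispersion than the noise features, so their weights dominate and `selected_idx` recovers columns 0 and 1. Note, per the abstract, that in general a high weight does not guarantee a feature is relevant, which is why the paper introduces a separate elimination model rather than thresholding weights directly.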
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.