The Performance of K-Means and K-Modes Clustering to Identify Cluster in Numerical Data
Abstract
Cluster analysis is the formal study of methods and algorithms for grouping objects naturally according to their perceived intrinsic characteristics and the measured similarities among the objects in each group. The pattern of each cluster and the relationships between clusters are identified and then related to the frequency of occurrence in the data set. The mean and the mode are measures of central tendency in a distribution. In clustering, they are used to discover the clusters that exist in a data set. This study therefore aims to compare the performance of the K-means and K-modes clustering techniques in finding the clusters that exist in numerical data. The two methods differ in that K-modes is usually applied to categorical data, while K-means is applied to numerical data; in this study, however, both are used to cluster numerical data. The performance of the two clustering methods is demonstrated using output from R software, and the results are compared to determine which method gives the best output. Finally, the efficiency of each method is presented.
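To make the contrast between the two techniques concrete, the following is a minimal Python sketch, not the authors' R workflow. It assumes a small hypothetical data set of integer-valued points, fixed initial centers, squared Euclidean distance with mean updates for K-means, and simple matching dissimilarity (the count of mismatched attributes) with per-attribute mode updates for K-modes. The helper names (`lloyd`, `wcss`, and so on) are illustrative only, and ties are broken in favor of the lowest cluster index.

```python
from collections import Counter


def sq_euclidean(p, c):
    """Squared Euclidean distance, the dissimilarity used by K-means."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def mean_center(members):
    """K-means update: the per-attribute mean of the cluster members."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def mismatches(p, c):
    """Simple matching dissimilarity, the usual choice for K-modes."""
    return sum(a != b for a, b in zip(p, c))


def mode_center(members):
    """K-modes update: the per-attribute mode of the cluster members."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))


def lloyd(points, centers, dissimilarity, update, iters=100):
    """Alternate assignment and center update until the centers stop changing."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iters):
        labels = [
            min(range(len(centers)), key=lambda j: dissimilarity(p, centers[j]))
            for p in points
        ]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(update(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def wcss(points, labels):
    """Within-cluster sum of squares around each cluster's mean."""
    total = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == j]
        center = mean_center(members)
        total += sum(sq_euclidean(p, center) for p in members)
    return total


# Hypothetical toy data: two well-separated groups of integer-valued points.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
init = [points[0], points[4]]  # assumed fixed starting centers

km_labels, km_centers = lloyd(points, init, sq_euclidean, mean_center)
kmo_labels, kmo_centers = lloyd(points, init, mismatches, mode_center)

print("K-means labels:", km_labels, "WCSS:", wcss(points, km_labels))
print("K-modes labels:", kmo_labels, "WCSS:", wcss(points, kmo_labels))
```

On this toy data, K-means separates the two groups, whereas K-modes assigns (9, 9) to the low-valued cluster. Simple matching treats each number as a label, so 9 is no closer to 8 than to 1, and the resulting tie falls to the first cluster. This illustrates the mechanics only and says nothing about the results reported in the study.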
Open access licenses
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.