Mini-batch k-Means versus k-Means to Cluster English Tafseer Text: View of Al-Baqarah Chapter
Keywords: Text mining, text clustering, k-means algorithm, mini-batch k-means algorithm, tafseer translation, Al-Baqarah chapter
Al-Quran is the primary text of Muslims' religion and practice. Millions of Muslims around the world use Al-Quran as their reference guide, so both Muslims in general and Islamic scholars can obtain knowledge from it. Al-Quran has been translated into various languages, for example English, by several translators. Each translator brings his own ideas, comments, and statements to the interpretation (Tafseer) from which the translation of the verses is derived. This paper therefore clusters the translation of the Tafseer using text clustering, a text mining method that groups related documents into the same cluster. The study applied two clustering algorithms, mini-batch k-means and k-means, to explain and define the links between keywords, known as features or concepts, in the Al-Baqarah chapter of 286 verses. For this dataset, data preprocessing was applied and features were extracted using TF-IDF (Term Frequency-Inverse Document Frequency) and PCA (Principal Component Analysis). The results are presented as two- and three-dimensional cluster plots that assign the Tafseer to seven clusters (k=7). The running time of the mini-batch k-means algorithm (0.05485 s) is shorter than that of the k-means algorithm (0.23334 s). Finally, the features 'god', 'people', and 'believe' were the most frequent features.
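The pipeline summarised above (TF-IDF features, PCA projection, then a timed comparison of k-means and mini-batch k-means) can be sketched roughly as follows. This is a minimal illustration and not the study's actual implementation: the use of scikit-learn, the function name `cluster_documents`, the placeholder sentences, the choice to cluster on the two PCA components, the `batch_size` and `n_init` values, and the ranking of features by summed TF-IDF weight are all assumptions made for the example.

```python
import time

import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented placeholder sentences, NOT the Tafseer text. In the study, each
# document would be the English Tafseer of one Al-Baqarah verse (286 in total).
PLACEHOLDER_DOCS = [
    "people believe and follow guidance",
    "people who believe give charity",
    "believe in god and the last day",
    "god knows what people conceal",
    "charity given in secret is better",
    "spend wealth in charity for the poor",
    "fasting is prescribed for the month",
    "complete the fasting until night",
    "the month of fasting brings guidance",
    "debts must be written with witnesses",
    "witnesses should not refuse the contract",
    "write the contract of debts fairly",
]


def cluster_documents(docs, k, seed=0):
    """Return 2-D coordinates, per-algorithm (labels, seconds), and top terms."""
    # Assumed preprocessing: lowercasing and English stop-word removal only.
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(docs)  # sparse matrix (n_docs, n_terms)

    # PCA needs a dense array; acceptable for a few hundred short documents.
    coords = PCA(n_components=2, random_state=seed).fit_transform(tfidf.toarray())

    models = {
        "k-means": KMeans(n_clusters=k, n_init=10, random_state=seed),
        "mini-batch k-means": MiniBatchKMeans(
            n_clusters=k, n_init=3, batch_size=64, random_state=seed
        ),
    }
    runs = {}
    for name, model in models.items():
        start = time.perf_counter()
        labels = model.fit_predict(coords)  # assumption: cluster on PCA output
        runs[name] = (labels, time.perf_counter() - start)

    # Assumed notion of "most frequent": highest summed TF-IDF weight.
    terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
    weights = np.asarray(tfidf.sum(axis=0)).ravel()
    top_terms = [terms[i] for i in np.argsort(weights)[::-1][:3]]
    return coords, runs, top_terms


coords, runs, top_terms = cluster_documents(PLACEHOLDER_DOCS, k=3)
for name, (labels, seconds) in runs.items():
    print(f"{name}: {seconds:.5f} s, labels={labels.tolist()}")
print("top terms:", top_terms)
```

With the real 286 documents, the same function would be called with `k=7` to match the setting reported above. Timings on a toy corpus of this size are dominated by overhead and say little about the relative speed of the two algorithms; the difference the study reports would only be expected to show up on the full dataset.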