Comparative Analysis of Mice Protein Expression: Clustering and Classification Approach

  • Mohd Zainuri Saringat, Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia
  • Aida Mustapha, Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia
  • Rachmadita Andeswari

Abstract

The mice protein expression dataset was created to study the effect of learning in normal mice and in trisomic mice, that is, mice with Down Syndrome (DS). The extra copy of a normal chromosome in DS is believed to alter normal pathways and normal responses to stimulation, causing learning and memory deficits. This research analyzes the protein expression dataset to identify proteins that may have influenced the recovery of learning ability among the trisomic mice. Two data mining tasks are employed: clustering and classification. Clustering analysis via K-Means, Hierarchical Clustering, and Decision Tree proved useful for identifying common critical protein responses, which in turn helps identify potentially more effective drug targets. Meanwhile, all classification models, namely k-Nearest Neighbor, Random Forest, and Naive Bayes, classified the protein samples into the given eight classes with very high accuracy.

Published
2018-11-25
How to Cite
Saringat, M. Z., Mustapha, A., & Andeswari, R. (2018). Comparative Analysis of Mice Protein Expression: Clustering and Classification Approach. International Journal of Integrated Engineering, 10(6). Retrieved from https://publisher.uthm.edu.my/ojs/index.php/ijie/article/view/2779