Machine Learning Approach to Predict AXL Kinase Inhibitor Activity for Cancer Drug Discovery Using XGBoost and Bayesian Optimization
Keywords:
Supervised learning, QSAR, molecular descriptors, ChEMBL, hyperparameter tuning
Abstract
Cancer remains a major global health challenge, characterized by uncontrolled cell growth and the potential for metastasis. In recent years, machine learning has made notable contributions to cancer drug discovery. This study uses the XGBoost algorithm, with hyperparameters tuned by Bayesian optimization, to classify AXL kinase inhibitor activity. A dataset of 1074 compounds with their IC50 values was obtained from the ChEMBL database, and molecular descriptors were calculated for each compound using the Mordred Python library. The optimized XGBoost model achieved an accuracy of 86.24%, a precision of 89.52%, a recall of 89.52%, and an F1-score of 89.52%. A comparative analysis with other machine learning models further supported XGBoost's efficacy. A Principal Component Analysis (PCA) plot was used to assess the model's applicability domain, which was found to be broad, with predictions considered reliable within its defined boundaries. In practical pharmaceutical research, the model can serve as a screening tool to prioritize compounds for synthesis and testing, potentially streamlining the drug development pipeline.
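As an illustration of how the reported metrics relate to one another, the following minimal sketch labels compounds from IC50 values and computes accuracy, precision, recall, and F1-score from predicted labels. The 1000 nM activity cutoff, the function names, and the toy IC50 values and predictions are assumptions made for this example only; the abstract does not state the threshold or the labeling scheme used in the study.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return 9.0 - math.log10(ic50_nm)


def label_active(ic50_nm: float, threshold_nm: float = 1000.0) -> int:
    """Return 1 (active) if IC50 <= threshold, else 0 (inactive).

    The 1000 nM default is a hypothetical cutoff, not the study's value.
    """
    return 1 if ic50_nm <= threshold_nm else 0


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (invented for illustration): IC50 values in nM and model predictions.
ic50_values_nm = [12, 250, 900, 1500, 8000, 40, 3000, 600]
y_true = [label_active(v) for v in ic50_values_nm]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the F1-score is the harmonic mean of precision and recall, it equals both whenever the two are equal, which is consistent with the identical 89.52% figures reported above.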
License
Copyright (c) 2024 Journal of Soft Computing and Data Mining
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.