Prediction of Tuberculosis Prevalence in Kelantan, Malaysia: Comparison of Model Prediction based on Lasso and Elastic Net Regression
Keywords:
Tuberculosis (TB), LASSO regression, Elastic Net regression, K-fold cross-validation
Abstract
Tuberculosis (TB), caused by the bacterium Mycobacterium tuberculosis, is a leading cause of death, with a mortality rate of approximately 50% when left untreated. This study aimed to identify the factors that influence the prediction of TB prevalence in Kelantan, Malaysia, using LASSO and Elastic Net regression. The objectives were to select the better of the two models using model selection criteria and then to apply the chosen model to predict TB prevalence in Kelantan. The study used cross-sectional data from TB patients in Kelantan, Malaysia, collected from individuals who underwent TB screening between 2019 and 2020. The results indicate that TB incidence in low-incidence areas, primary education, ex-smoker status, BCG vaccination, night sweats, weight loss, and loss of appetite significantly influenced the model under both methods. Both the corrected Akaike information criterion (AICc) and the Bayesian information criterion (BIC) indicated that logistic LASSO regression outperformed logistic Elastic Net regression in predicting TB prevalence. The selected model predicted varying probabilities of TB prevalence across different scenarios and conditions. Future research should collaborate with healthcare institutions to obtain more comprehensive medical data and explore alternative methodologies, such as Artificial Neural Networks (ANNs), to yield more impactful insights into predicting TB prevalence.
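For readers unfamiliar with the two methods, the comparison can be sketched in one common (glmnet-style) parametrization. This is an illustrative formulation, not necessarily the exact one used in this study. The symbols n (sample size), x_i (covariate vector), y_i (binary TB status), lambda (penalty strength), alpha (mixing parameter), and k (number of estimated parameters) are assumptions introduced here. For penalized fits, k is often taken as the number of non-zero coefficients.

```latex
% Penalized logistic regression: LASSO is the special case alpha = 1,
% Elastic Net uses 0 < alpha < 1 (assumed glmnet-style parametrization).
\begin{align}
(\hat{\beta}_0, \hat{\beta})
  &= \arg\min_{\beta_0,\,\beta}
     \left\{ -\frac{1}{n}\,\ell(\beta_0,\beta)
     + \lambda \left[ \alpha \lVert \beta \rVert_1
     + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right] \right\}, \\
\ell(\beta_0,\beta)
  &= \sum_{i=1}^{n} \left[ y_i \left( \beta_0 + x_i^{\top}\beta \right)
     - \log\!\left( 1 + e^{\beta_0 + x_i^{\top}\beta} \right) \right], \\
% Model selection criteria evaluated at the fitted model (lower is better).
\mathrm{AIC}_c
  &= -2\,\ell(\hat{\beta}_0,\hat{\beta}) + 2k + \frac{2k(k+1)}{n-k-1}, \\
\mathrm{BIC}
  &= -2\,\ell(\hat{\beta}_0,\hat{\beta}) + k \log n .
\end{align}
```

Under this formulation, lambda is typically chosen by K-fold cross-validation, and the model with the lower AICc and BIC is preferred.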



