Comparative Analysis of Naïve Bayesian Techniques for Health-Related Classification Tasks
Keywords:
Naïve Bayes, algorithms, data mining, classification.
Abstract
Naïve Bayes is a family of classification algorithms based on Bayes' theorem, which use the naive assumption of conditional independence among predictors to predict the class of unseen data. Two problems that face classification techniques are the accuracy of the classification and the number of classification errors. This research will explore several Naïve Bayes techniques, which give different results depending on their respective algorithms, and will focus on a comparative analysis of the performance of these variants. Naïve Bayes is generally used in four kinds of application: real-time prediction, multiclass prediction, text classification, and recommendation systems. The methodology used in this project is based on CRISP-DM and follows its phases. To address the problems of accuracy and classification errors, this research will apply an ensemble of multiple classifiers in order to produce better predictive performance than a single model. Three variants of the Naïve Bayes model are considered: Gaussian, Multinomial, and Bernoulli. All three belong to the same family of classification techniques built on Bayes' theorem. The Gaussian model is used for basic classification and assumes that the features of a dataset follow a normal distribution. The Multinomial model is used for discrete counts, such as the number of times outcome x is observed over n trials. The Bernoulli model is intended for binary feature vectors. The objective is to apply and implement the original Naïve Bayes model alongside existing variants, namely the Multinomial, Gaussian, and Bernoulli Naïve Bayes models.
The outcome of this study will focus on the differences, capabilities, and performance of the Naïve Bayes probabilistic classifiers.
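To make the distinction between the three variants concrete, the following is a minimal sketch in plain Python of the per-class likelihood each one assumes, followed by a simple ensemble. It is an illustration only, not the implementation used in this study: the function names, the class labels "healthy" and "sick", and all numeric parameters are hypothetical, and hard majority voting is assumed here because the abstract does not state which ensemble scheme is applied.

```python
import math
from collections import Counter


def gaussian_log_likelihood(x, mean, var):
    """Log density of one continuous feature value under a normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def multinomial_log_likelihood(counts, probs):
    """Log likelihood of feature counts given per-feature probabilities.

    The multinomial coefficient is omitted: it is identical for every class,
    so it does not change which class scores highest.
    """
    return sum(c * math.log(p) for c, p in zip(counts, probs))


def bernoulli_log_likelihood(bits, probs):
    """Log likelihood of binary features; an absent feature (0) contributes log(1 - p)."""
    return sum(math.log(p) if b else math.log(1 - p) for b, p in zip(bits, probs))


def predict(log_priors, log_likelihoods):
    """Return the class with the largest log prior + log likelihood."""
    return max(log_priors, key=lambda c: log_priors[c] + log_likelihoods[c])


def majority_vote(labels):
    """Hard-voting ensemble (an assumption): return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy parameters (not taken from the study): one continuous
# measurement, one count vector, and one binary vector for a single record.
log_priors = {"healthy": math.log(0.5), "sick": math.log(0.5)}

gaussian_pred = predict(log_priors, {
    "healthy": gaussian_log_likelihood(38.2, mean=36.8, var=0.16),
    "sick": gaussian_log_likelihood(38.2, mean=38.5, var=0.36),
})
multinomial_pred = predict(log_priors, {
    "healthy": multinomial_log_likelihood([3, 0], [0.3, 0.7]),
    "sick": multinomial_log_likelihood([3, 0], [0.8, 0.2]),
})
bernoulli_pred = predict(log_priors, {
    "healthy": bernoulli_log_likelihood([1, 0], [0.5, 0.2]),
    "sick": bernoulli_log_likelihood([1, 0], [0.4, 0.7]),
})

# Combine the three single-model predictions into one ensemble prediction.
ensemble_pred = majority_vote([gaussian_pred, multinomial_pred, bernoulli_pred])
```

In this sketch the three models differ only in how the likelihood of the features is computed; the prior and the decision rule are shared. Working in log space avoids numerical underflow when many feature probabilities are multiplied together.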