E-mail Spam Filtering using Genetic Algorithm based on Probabilistic Weights and Words Count
Keywords: Bayes, Naive Bayes, Spam, Ham, Genetic Algorithm
Spam email filtering is an active area of research, as the volume of spam keeps growing over time. Most spam mails are promotional in nature; they are therefore usually not harmful to computers, but they are annoying for users. Spam can be filtered using methods such as Bayes and Naive Bayes classification. Classification is based on the content of the mail, in particular on its words: the probability of finding each word among the spam and ham classifier words is calculated. A few words occur in both spam and ham mails, so a threshold-based mechanism is desirable for correct classification. For Bayes and Naive Bayes to classify correctly, the dataset should be huge; ideally, the number of mails would be infinite. In real applications, however, a scheme is desired that is adaptive in nature and gives good results with only a few mails. In this direction, this paper details a genetic algorithm based spam detection method that is very simple and provides good results with a limited dataset.
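As an illustration of the word-probability classification with a threshold mentioned in the abstract, the following is a minimal Naive Bayes sketch. It is not the genetic algorithm method of the paper, which the abstract does not specify. The function names (`train`, `spam_probability`, `is_spam`), the whitespace tokenization, the Laplace smoothing, and the default threshold of 0.5 are all assumptions made for this example, and the sketch assumes the training set contains at least one spam and one ham mail.

```python
import math
from collections import Counter


def train(mails):
    """Count words per class. `mails` is a list of (text, label) pairs,
    where label is 'spam' or 'ham' (hypothetical input format)."""
    counts = {"spam": Counter(), "ham": Counter()}
    n_mails = Counter()
    for text, label in mails:
        counts[label].update(text.lower().split())
        n_mails[label] += 1
    return counts, n_mails


def spam_probability(text, counts, n_mails):
    """Return the Naive Bayes posterior probability that `text` is spam."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    total_mails = sum(n_mails.values())
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(counts[label].values())
        # Class prior estimated from the share of training mails.
        score = math.log(n_mails[label] / total_mails)
        for word in text.lower().split():
            # Laplace smoothing (an assumption) keeps unseen words
            # from driving the probability to zero.
            score += math.log(
                (counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_score[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])


def is_spam(text, counts, n_mails, threshold=0.5):
    """Flag a mail as spam when its spam probability reaches the threshold."""
    return spam_probability(text, counts, n_mails) >= threshold


# Toy training data, invented for illustration only.
training_mails = [
    ("win free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]
counts, n_mails = train(training_mails)
```

Raising `threshold` makes the filter more conservative: words that appear in both classes pull the probability toward the middle, so fewer borderline mails are flagged as spam.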
Open access licenses
Open access is provided by licensing the content under a Creative Commons (CC) license.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.