Classification of Spear Phishing Email using Machine Learning Approach

Mohammad Akmal Afif Mohd Zuhdi; Isredza Rahmi A Hamid

Authors

Mohammad Akmal Afif Mohd Zuhdi, Universiti Tun Hussein Onn Malaysia
Isredza Rahmi A Hamid, Universiti Tun Hussein Onn Malaysia

Keywords:

Spear Phishing, Email Classification, Machine Learning

Abstract

Spear phishing attacks against organizations are on the rise, and the techniques used in spear phishing emails are becoming more diverse. Although previous research has focused on identifying phishing emails based on their headers, bodies, or attachments, this study addresses spear phishing email classification using a machine learning approach. The research focuses on content-based features rather than headers, bodies, or attachments. The proposed spear phishing email classification model comprises seven phases: raw data acquisition, data pre-processing, feature extraction, n-fold cross-validation, classification algorithm selection, email classification, and model performance evaluation. The experiment uses content-based features extracted from the Enron dataset. The model's effectiveness is assessed with the Random Forest and Naïve Bayes classification algorithms, using Area Under the Curve (AUC), precision, recall, and F1-score as evaluation metrics. Random Forest performed exceptionally well, with an AUC of 0.996, an F1-score of 0.968, a precision of 0.969, and a recall of 0.967. Naïve Bayes achieved moderate results, with an AUC of 0.742, an F1-score of 0.701, a precision of 0.677, and a recall of 0.727.
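The four evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the function names (`precision_recall_f1`, `f1_from`, `auc`) and the toy labels and scores are hypothetical, the positive class (label 1) is assumed to stand for spear phishing, and the 0.5 decision threshold is an arbitrary choice for the example. The last two lines check that the reported F1-scores agree with the harmonic mean of the reported precision and recall, assuming each reported figure is a single precision/recall pair rather than a per-class breakdown.

```python
from typing import Sequence, Tuple


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = spear phishing, 0 = legitimate email.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # classifier confidence
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print(precision_recall_f1(y_true, y_pred))
print(auc(y_true, scores))

# Consistency check against the figures reported in the abstract.
print(round(f1_from(0.969, 0.967), 3))  # Random Forest
print(round(f1_from(0.677, 0.727), 3))  # Naïve Bayes
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.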
