Leveraging Large Language Models for Cybersecurity: Enhancing SMS Spam Detection with Robust and Context-Aware Text Classification

Mohsen Ahmadi, Matin Khajavi, Abbas Varmaghani, Ali Ala, Kasra Danesh, Danial Javaheri

TL;DR

This study treats SMS spam detection as a text-classification problem and systematically compares six classifiers across two feature representations, bag-of-words (BoW) and TF-IDF. The results show that TF-IDF consistently outperforms BoW, with Naive Bayes (NB) achieving the top accuracy of 96.2% when paired with TF-IDF, followed by support vector machines (SVM) at 94.5% and a deep neural network (DNN) at 91.0%. Using a Kaggle SMS dataset of 5,572 messages (notably imbalanced toward ham, i.e. non-spam), the work highlights the importance of feature representation for classification performance and demonstrates that simpler models with robust features can rival more complex architectures. The findings inform practical SMS filtering by recommending TF-IDF with NB, SVM, or DNN for reliable spam detection, while also outlining limitations and avenues for extending this work with embeddings and real-time systems.
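Because the dataset is imbalanced toward ham, accuracy alone can hide weak spam detection. The sketch below illustrates this with hypothetical confusion-matrix counts; the numbers are invented for illustration and are not taken from the paper, and the helper name `per_class_metrics` is an assumption.

```python
def per_class_metrics(tp, fp, fn, tn):
    """Return accuracy plus precision and recall for the positive (spam) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for 1,000 messages (900 ham, 100 spam); NOT the paper's data.
tp, fn = 40, 60    # spam correctly flagged / spam missed
fp, tn = 5, 895    # ham wrongly flagged / ham correctly passed

accuracy, precision, recall = per_class_metrics(tp, fp, fn, tn)
print(f"accuracy={accuracy:.3f} spam_precision={precision:.3f} spam_recall={recall:.3f}")
# prints: accuracy=0.935 spam_precision=0.889 spam_recall=0.400
```

In this toy case the classifier misses 60% of spam yet still reports 93.5% accuracy, which is why per-class precision and recall are worth reading alongside the headline accuracy figures.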

Abstract

This study evaluates the effectiveness of different feature extraction techniques and classification algorithms in detecting spam messages within SMS data. We analyzed six classifiers (Naive Bayes, K-Nearest Neighbors, Support Vector Machines, Linear Discriminant Analysis, Decision Trees, and Deep Neural Networks) using two feature extraction methods: bag-of-words and TF-IDF. The primary objective was to determine the most effective classifier-feature combination for SMS spam detection. Our research offers two main contributions: first, a systematic examination of classifier and feature extraction pairings, and second, an empirical evaluation of their ability to distinguish spam messages. Our results demonstrate that the TF-IDF method consistently outperforms the bag-of-words approach across all six classifiers. Specifically, Naive Bayes with TF-IDF achieved the highest accuracy of 96.2%, with a precision of 0.976 for non-spam and 0.754 for spam messages. Similarly, Support Vector Machines with TF-IDF reached an accuracy of 94.5%, with a precision of 0.926 for non-spam and 0.891 for spam. Deep Neural Networks using TF-IDF yielded an accuracy of 91.0%, with a recall of 0.991 for non-spam and 0.415 for spam messages. In contrast, K-Nearest Neighbors, Linear Discriminant Analysis, and Decision Trees showed weaker performance regardless of the feature extraction method employed. Furthermore, we observed substantial variability in classifier effectiveness depending on the chosen feature extraction technique. Our findings emphasize the significance of the feature extraction method in SMS spam detection and suggest that TF-IDF, when paired with Naive Bayes, Support Vector Machines, or Deep Neural Networks, provides the most reliable performance. These insights provide a foundation for improving SMS spam detection through optimized feature extraction and classification methods.
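To make the difference between the two feature extraction methods concrete, the following is a minimal sketch of bag-of-words counts and TF-IDF weights using only the Python standard library. The abstract does not say which TF-IDF variant or tokenizer the authors used, so the smoothed IDF formula, the whitespace tokenizer, the function names, and the three example messages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bow_vectors(docs):
    """Bag-of-words: one raw term-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """TF-IDF: term counts reweighted by a smoothed inverse document frequency."""
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # One common smoothed variant (an assumption): ln((1 + N) / (1 + df)) + 1.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    vectors = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, vectors


# Hypothetical messages, not drawn from the dataset.
docs = ["free prize call now", "call me now", "see you at lunch"]

vocab, bow = bow_vectors(docs)
_, tfidf = tfidf_vectors(docs)

free_idx, call_idx = vocab.index("free"), vocab.index("call")
print("BoW    free/call:", bow[0][free_idx], bow[0][call_idx])
print("TF-IDF free/call:", round(tfidf[0][free_idx], 3), round(tfidf[0][call_idx], 3))
```

In the first message, "free" and "call" have the same bag-of-words count, but TF-IDF gives "free" the larger weight because it appears in fewer messages. That kind of reweighting toward rarer, more distinctive terms is one plausible reason a classifier might benefit from TF-IDF features, though the abstract itself does not give an explanation.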

Paper Structure

This paper contains 29 sections, 25 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Linear situations are characterized by direct proportionality between variables – as one variable changes, the other does so in a predictable, constant way. These relationships can be visually represented by a straight line on a graph.
  • Figure 2: Conceptual diagram and workflow
  • Figure 3: The confusion matrices for various machine learning models utilizing the Bag-of-Words (BoW) feature extraction technique
  • Figure 4: The ROC curve comparing different machine learning models employing the Bag-of-Words (BoW) feature extraction method
  • Figure 5: The scree plot illustrating the feature reduction of TF-IDF features using Principal Component Analysis (PCA)
  • ...and 2 more figures