
Towards Trustworthy Keylogger detection: A Comprehensive Analysis of Ensemble Techniques and Feature Selections through Explainable AI

Monirul Islam Mahmud

TL;DR

This paper addresses trustworthy keylogger detection by evaluating a comprehensive pipeline that combines ten learners with ensemble techniques (Voting, Stacking, Blending) and multiple feature selection methods (Information Gain, Lasso L1, Fisher Score). It emphasizes model interpretability by applying SHAP for global explanations and LIME for local explanations, demonstrating AdaBoost with Fisher Score features as the top configuration (99.76% accuracy, AUC 0.999) and achieving substantial feature reduction (about 45%). The work leverages the Kaggle Keylogger Detection dataset, uses SMOTE to balance classes, and reports robust performance across metrics such as accuracy, F1, and specificity, while providing actionable insights into which network features drive detection. The findings advance practical, explainable threat detection suitable for deployment in cybersecurity environments, with future directions including real-time deployment, cross-domain validation, and federated learning.
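As a rough illustration of the Fisher Score ranking mentioned above, the sketch below scores each feature by the ratio of between-class scatter to within-class scatter and keeps the top-ranked columns. It assumes the common textbook definition of the Fisher score; the paper's exact formulation is not given here. The function names and the toy data are hypothetical and are not drawn from the Kaggle dataset.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X (a list of rows).

    Assumed definition (not confirmed by the paper):
    score_j = sum_k n_k * (mean_jk - mean_j)^2 / sum_k n_k * var_jk
    """
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for label in labels:
            values = [v for v, lab in zip(column, y) if lab == label]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature with zero within-class variance gets score 0.0 here
        # to avoid division by zero (a simplification for this sketch).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: feature 0 separates the classes,
# feature 1 does not, feature 2 is constant.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.0],
    [0.8, 4.0, 2.0],
    [3.0, 4.0, 2.0],
    [3.2, 5.0, 2.0],
    [2.8, 3.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]

scores = fisher_scores(X, y)
selected = select_top_k(scores, 1)
print(scores, selected)
```

A ranking of this kind is what allows a large share of features to be dropped before training: only the columns returned by `select_top_k` would be passed on to the classifier.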

Abstract

Keylogger detection involves monitoring for unusual system behaviors, such as delays between typing and character display, and analyzing network traffic patterns for signs of data exfiltration. In this study, we provide a comprehensive analysis of keylogger detection using traditional machine learning models (SVC, Random Forest, Decision Tree, XGBoost, AdaBoost, Logistic Regression, and Naive Bayes) and advanced ensemble methods (Stacking, Blending, and Voting). Moreover, feature selection approaches, namely Information Gain, Lasso L1, and Fisher Score, are thoroughly assessed to improve predictive performance and lower computational complexity. The study uses the publicly available Keylogger Detection dataset from Kaggle. In addition to accuracy-based classification, this study applies Explainable AI (XAI) techniques, namely SHAP (global) and LIME (local), to explain how much each feature contributes to or hinders detection. To evaluate the models, we use AUC, sensitivity, specificity, accuracy, and F1 score. The best performance was achieved by AdaBoost with Fisher Score feature selection: 99.76% accuracy, an F1 score of 0.99, 100% precision, 98.6% recall, a specificity of 1.0, and an AUC of 0.99, which is near-perfect classification.
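To make the relationship between the reported metrics concrete, the following minimal sketch derives accuracy, precision, recall (sensitivity), specificity, and F1 from the four cells of a binary confusion matrix. The function name and the counts are hypothetical; the counts were chosen only so that precision is 1.0 and recall is 0.986, and they are not the paper's actual confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (not from the paper): no false positives,
# 14 missed keylogger samples out of 1000.
metrics = classification_metrics(tp=986, fp=0, tn=1000, fn=14)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With no false positives, precision and specificity are both 1.0, and a precision of 1.0 combined with a recall of 0.986 gives an F1 of roughly 0.993, which is consistent with the rounded F1 of 0.99 reported above.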

Paper Structure

This paper contains 15 sections, 13 figures.

Figures (13)

  • Figure 1: Overall methodology.
  • Figure 2: Barplot Distribution.
  • Figure 3: Violin Plot Distribution.
  • Figure 4: Boxplot Distribution.
  • Figure 5: Target class distribution before and after SMOTE.
  • ...and 8 more figures