Decoding Android Malware with a Fraction of Features: An Attention-Enhanced MLP-SVM Approach

Safayat Bin Hakim, Muhammad Adil, Kamal Acharya, Houbing Herbert Song

TL;DR

This paper tackles Android malware detection and family classification amid highly obfuscated, evolving threats. It introduces a hybrid pipeline that pairs an attention-enhanced MLP for representation learning with a Radial Basis Function (RBF) SVM. After an initial selection of 47 features from the CCCS-CIC-AndMal-2020 dataset, Linear Discriminant Analysis reduces the features from 512 to 14. The approach achieves over 99% accuracy while drastically reducing feature dimensionality, and it provides SHAP-based explanations for model interpretability. The authors report strong performance advantages over state-of-the-art methods, with implications for scalable, efficient, and explainable mobile threat detection in real-world deployments.

Abstract

The escalating sophistication of Android malware poses significant challenges to traditional detection methods, necessitating innovative approaches that can efficiently identify and classify threats with high precision. This paper introduces a novel framework that synergistically integrates an attention-enhanced Multi-Layer Perceptron (MLP) with a Support Vector Machine (SVM) to make Android malware detection and classification more effective. By carefully analyzing a mere 47 features out of over 9,760 available in the comprehensive CCCS-CIC-AndMal-2020 dataset, our MLP-SVM model achieves an impressive accuracy over 99% in identifying malicious applications. The MLP, enhanced with an attention mechanism, focuses on the most discriminative features and further reduces the 47 features to only 14 components using Linear Discriminant Analysis (LDA). Despite this significant reduction in dimensionality, the SVM component, equipped with an RBF kernel, excels in mapping these components to a high-dimensional space, facilitating precise classification of malware into their respective families. Rigorous evaluations, encompassing accuracy, precision, recall, and F1-score metrics, confirm the superiority of our approach compared to existing state-of-the-art techniques. The proposed framework not only significantly reduces the computational complexity by leveraging a compact feature set but also exhibits resilience against the evolving Android malware landscape.
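The two ideas the abstract leans on, attention-based feature weighting and the RBF kernel, can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax form of the attention weights, the `gamma` value, and the toy vectors are all assumptions made for illustration, and the LDA step is omitted entirely.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weight(features, scores):
    """Scale each feature by its softmax attention weight (assumed form)."""
    weights = softmax(scores)
    return [w * x for w, x in zip(weights, features)]

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: exp(-gamma * ||a - b||^2). gamma is a hypothetical value."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

# Hypothetical 4-dimensional feature vectors for two apps.
x1 = [1.0, 0.0, 2.0, 0.5]
x2 = [0.9, 0.1, 1.8, 0.4]

# Hypothetical attention scores; in a trained model a learned layer produces these.
scores = [2.0, 0.1, 1.5, 0.3]

z1 = attention_weight(x1, scores)
z2 = attention_weight(x2, scores)

# Kernel similarity between the two re-weighted vectors, in (0, 1].
similarity = rbf_kernel(z1, z2)
print(similarity)
```

Features with larger scores keep more of their magnitude after weighting, so they dominate the squared distance inside the kernel. That is one way to read the abstract's claim that attention "focuses on the most discriminative features" before the SVM separates the classes.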

Paper Structure (30 sections, 8 equations, 8 figures, 3 tables, 1 algorithm)

Figures (8)

  • Figure 1: Attention-Based Feature Weighting in the MLP-SVM Model for Android Malware Classification. This diagram illustrates the flow of data through various layers in the model, highlighting the integration of the attention mechanism for dynamic feature weighting.
  • Figure 2: Workflow of the integrated MLP-Attention and SVM model for Android malware classification. The diagram illustrates the sequential processing stages from input data through feature extraction, attention-based feature refinement, and SVM classification, culminating in malware classification, with a side panel detailing performance evaluation metrics including accuracy, precision, recall, F1-score, and Explainable AI (XAI) with SHAP.
  • Figure 3: Class distribution before and after applying class weights in the CCCS-CIC-AndMal-2020 dataset. The left subplot shows the original class distribution, indicating significant class imbalance. The right subplot demonstrates the adjusted class distribution, achieving a more balanced scenario through class weighting, critical for unbiased model training and evaluation.
  • Figure 4: Comparison of class distribution and F1 scores before and after class weighting. The graph suggests that the baseline model, despite its higher F1 scores, may be overfitting to the majority classes, as indicated by the significant fluctuations in F1 scores when class weights are adjusted.
  • Figure 5: Parallel coordinate plot illustrating the interdependencies and impact of various hyperparameters optimized during the study. Each line represents a trial, showing how hyperparameters like batch size, dropout rates, and learning rates interact to influence the objective value.
  • ...and 3 more figures
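
The class weighting referred to in Figures 3 and 4 can be sketched with the common "balanced" heuristic, where each class weight is n_samples / (n_classes * class_count). Whether the paper uses exactly this scheme is not stated in this excerpt, so treat the formula and the toy labels below as assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}

# Hypothetical, deliberately imbalanced label list (not real dataset counts).
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1

weights = balanced_class_weights(labels)
counts = Counter(labels)

# After weighting, every class contributes the same effective mass.
effective = {cls: weights[cls] * counts[cls] for cls in counts}
print(weights)
print(effective)
```

Under this heuristic the rarest class receives the largest weight, so errors on minority families cost more during training. That is the mechanism by which weighting can shift per-class F1 scores relative to an unweighted baseline.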