Machine Learning-based Android Intrusion Detection System
Madiha Tahreem, Ifrah Andleeb, Bilal Zahid Hussain, Arsalan Hameed
TL;DR
This paper tackles Android malware detection by framing it as a binary classification problem over APK-permission and Binder/API-related features. It compares multiple classifiers—SVM, Random Forest, Linear Discriminant Analysis, and LightGBM—on the Deep Learning for Cyber Security APK dataset (8078 samples, 70 features, 70:30 train/test split). Random Forest achieves the best performance, reporting an accuracy of 99.11% (recall 99.53%, precision 99.88%) with grid-tuned hyperparameters (e.g., 140 trees, max_depth 22). The results demonstrate the viability of behavior-based, ML-driven Android intrusion detection, though limitations in misclassified malicious samples and real-time deployment are acknowledged, signaling avenues for future work and broader datasets.
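The evaluation setup summarized above can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: the data here is synthetic (random binary "permission" columns with a made-up labeling rule), and only the 70-feature width, the 70:30 split, and the two reported Random Forest hyperparameters (140 trees, max_depth 22) are taken from the summary. All variable names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the APK dataset: 1000 apps x 70 binary features.
# (The real dataset has 8078 samples; a smaller set keeps the sketch fast.)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 70))

# Made-up labeling rule (assumption): an app is "malicious" (1) if it has
# features 0 and 1 together, or feature 2. About 2% of labels are flipped
# to mimic noise.
y = ((X[:, 0] & X[:, 1]) | X[:, 2]).astype(int)
flip = rng.random(1000) < 0.02
y = np.where(flip, 1 - y, y)

# 70:30 train/test split, stratified so both parts keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Random Forest with the hyperparameters reported in the summary.
clf = RandomForestClassifier(n_estimators=140, max_depth=22, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# The three metrics the summary reports, with "malicious" as the positive class.
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

The numbers printed by this sketch describe the synthetic data only and say nothing about the paper's reported results. In a malware setting recall on the malicious class is usually the metric to watch, since a missed malicious sample is costlier than a false alarm.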
Abstract
The Android operating system runs on most smart devices, and intrusions targeting it are rising rapidly. Malicious data streams expose these devices to attacks such as phishing, spyware, SMS fraud, bots, and banking Trojans. This paper applies machine learning classification algorithms to the security of Android APK files. Each APK data stream was labeled as either malicious or non-malicious on the basis of different parameters. The classification techniques are then used to determine whether a newly installed application's signature falls within the malicious or non-malicious domain. If it falls within the malicious category, appropriate action can be taken, and the Android operating system can be shielded against illegal activities.
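The classify-then-act flow described in the abstract might look roughly like the sketch below. Everything in it is hypothetical: the four-column feature list, the function names, and the stand-in `toy_predict` rule are illustrative assumptions, not the paper's feature set or model. In practice `predict` would wrap a trained classifier.

```python
# Hypothetical, fixed column order for the feature vector (assumption;
# the paper's dataset uses 70 features that are not listed here).
FEATURE_COLUMNS = [
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
    "android.permission.RECEIVE_BOOT_COMPLETED",
]


def encode_permissions(requested, columns=FEATURE_COLUMNS):
    """Turn an app's requested permissions into a 0/1 vector in column order."""
    requested = set(requested)
    return [1 if name in requested else 0 for name in columns]


def triage(feature_vector, predict):
    """Return 'block' if the classifier flags the app as malicious (1), else 'allow'."""
    return "block" if predict(feature_vector) == 1 else "allow"


def toy_predict(vector):
    """Stand-in for a trained model: flags apps that can both send SMS
    and start at boot. Purely illustrative, not a real detection rule."""
    return 1 if vector[0] == 1 and vector[3] == 1 else 0


new_app = ["android.permission.SEND_SMS", "android.permission.RECEIVE_BOOT_COMPLETED"]
decision = triage(encode_permissions(new_app), toy_predict)
print(decision)
```

Keeping the column order fixed matters: the vector built at install time has to match the layout the classifier was trained on, otherwise the prediction is meaningless.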
