Multiple-Input Auto-Encoder Guided Feature Selection for IoT Intrusion Detection Systems
Phai Vu Dinh, Diep N. Nguyen, Dinh Thai Hoang, Quang Uy Nguyen, Eryk Dutkiewicz, Son Pham Bao
TL;DR
The paper addresses IoT intrusion detection in highly heterogeneous, high-dimensional data by introducing a novel multimodal autoencoder (MIAE) and its feature-selection extension (MIAEFS). MIAE uses multiple sub-encoders to fuse diverse feature groups into a single latent representation trained unsupervised, while MIAEFS adds a feature-selection layer that enforces sparsity via a group-lasso-like regularization to rank and prune latent features. A preprocessing algorithm based on symmetric KL divergence and hierarchical clustering creates meaningful feature groups to train separate sub-encoders, reducing training complexity and preserving group-specific information. Across NSLKDD, UNSW-NB15, and IDS2017, MIAE and MIAEFS outperform conventional classifiers, standard dimensionality reduction methods, and other unsupervised representation techniques, achieving high detection accuracy (e.g., 0.988 on IDS2017 with RF) and very fast inference times with model sizes under 1 MB. The work demonstrates that combining unsupervised multimodal representation learning with principled feature selection yields robust, scalable IDSs for heterogeneous IoT environments and offers paths for application to other domains with non-iid feature groups.
Abstract
While intrusion detection systems (IDSs) benefit from the diversity and generalization of IoT data features, the data diversity (e.g., the heterogeneity and high dimensions of data) also makes it difficult to train effective machine learning models in IoT IDSs. This also leads to potentially redundant/noisy features that may decrease the accuracy of the detection engine in IDSs. This paper first introduces a novel neural network architecture called Multiple-Input Auto-Encoder (MIAE). MIAE consists of multiple sub-encoders that can process inputs from different sources with different characteristics. The MIAE model is trained in an unsupervised learning mode to transform the heterogeneous inputs into a lower-dimensional representation, which helps classifiers distinguish between normal behaviour and different types of attacks. To distil and retain more relevant features but remove less important/redundant ones during the training process, we further design and embed a feature selection layer right after the representation layer of MIAE, resulting in a new model called MIAEFS. This layer learns the importance of features in the representation vector, facilitating the selection of informative features from the representation vector. The results on three IDS datasets, i.e., NSLKDD, UNSW-NB15, and IDS2017, show the superior performance of MIAE and MIAEFS compared to other methods, e.g., conventional classifiers, dimensionality reduction models, unsupervised representation learning methods with different input dimensions, and unsupervised feature selection models. Moreover, MIAE and MIAEFS combined with the Random Forest (RF) classifier achieve an accuracy of 96.5% in detecting sophisticated attacks, e.g., Slowloris. The average running time for detecting an attack sample using RF with the representation of MIAE and MIAEFS is approximately 1.7E-6 seconds, whilst the model size is lower than 1 MB.
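The data flow described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation. The class name, layer sizes, tanh activation, and the per-feature gating used for the selection layer are all assumptions made for illustration, and the weights are random. Each feature group passes through its own sub-encoder, the outputs are concatenated into one representation vector, a selection layer weights each latent feature, and a decoder reconstructs the full input. Training would minimise the reconstruction error, which is only computed here.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Hypothetical dense layer: random weight rows plus a zero bias vector."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation=math.tanh):
    """Apply one dense layer to the input vector x."""
    weights, bias = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


class MultiInputAutoEncoder:
    """Illustrative multiple-input auto-encoder (assumed structure, untrained)."""

    def __init__(self, group_sizes, latent_per_group, seed=0):
        rng = random.Random(seed)
        # One sub-encoder per input feature group.
        self.sub_encoders = [make_layer(n, latent_per_group, rng)
                             for n in group_sizes]
        latent_dim = latent_per_group * len(group_sizes)
        # Assumed selection layer: one importance weight per latent feature.
        self.selector = [rng.uniform(0.0, 1.0) for _ in range(latent_dim)]
        # A single decoder reconstructs all groups from the gated representation.
        self.decoder = make_layer(latent_dim, sum(group_sizes), rng)

    def encode(self, groups):
        """Concatenate the sub-encoder outputs into one representation vector."""
        z = []
        for x, layer in zip(groups, self.sub_encoders):
            z.extend(dense(x, layer))
        return z

    def reconstruct(self, groups):
        """Gate the representation by importance, then decode linearly."""
        gated = [s * v for s, v in zip(self.selector, self.encode(groups))]
        return dense(gated, self.decoder, activation=lambda v: v)

    def reconstruction_error(self, groups):
        """Mean squared error between the flattened input and its reconstruction."""
        target = [v for g in groups for v in g]
        output = self.reconstruct(groups)
        return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)

    def top_features(self, k):
        """Indices of the k latent features with the largest importance weights."""
        ranked = sorted(range(len(self.selector)),
                        key=lambda i: abs(self.selector[i]), reverse=True)
        return sorted(ranked[:k])


# Three hypothetical feature groups of sizes 4, 3, and 5 for one sample.
model = MultiInputAutoEncoder(group_sizes=[4, 3, 5], latent_per_group=2)
sample = [[0.1, 0.5, 0.0, 1.0], [0.3, 0.2, 0.9], [0.0, 0.0, 1.0, 0.4, 0.7]]
z = model.encode(sample)               # 6-dimensional representation
keep = model.top_features(3)           # indices of the 3 highest-weighted features
reduced = [z[i] for i in keep]         # vector that would be passed to a classifier
error = model.reconstruction_error(sample)
```

In this sketch, `reduced` plays the role of the selected representation that a downstream classifier such as Random Forest would consume. How the paper actually parameterises and regularises the selection layer is not specified in the abstract.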
