A Hypergraph-Based Machine Learning Ensemble Network Intrusion Detection System
Zong-Zhi Lin, Thomas D. Pike, Mark M. Bailey, Nathaniel D. Bastian
TL;DR
The paper tackles real-time network intrusion detection in the face of evolving port-scan and adversarial threats. It represents port-scan activity as a hypergraph and derives $s$-closeness centrality features that augment a tree-based ML ensemble including Random Forest (RF) and LightGBM models. Using an online evaluation framework with adversarial data augmentation and an Update-ALL-NIDS retraining policy, which retrains every model in the ensemble on each retraining request, the approach achieves near-perfect detection on port-scan data and on the CIC-IDS2017 dataset. The work shows the practical value of hypergraph metrics in an adaptive NIDS: they support timely retraining and a stronger defense against sophisticated cyber threats.
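To make the hypergraph features concrete, the following is a minimal sketch of $s$-closeness centrality on a toy set of flows. It rests on assumptions that are not taken from the paper: each source IP is treated as one hyperedge containing the destination ports it contacted, two hyperedges are $s$-adjacent when they share at least $s$ ports, and closeness is the number of reachable hyperedges divided by the sum of shortest-path distances to them (normalization conventions vary). The function names and sample addresses are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_hyperedges(flows):
    """Assumed construction: one hyperedge per source IP, holding its destination ports."""
    edges = defaultdict(set)
    for src_ip, dst_port in flows:
        edges[src_ip].add(dst_port)
    return dict(edges)


def s_line_graph(edges, s):
    """Link two hyperedges when they share at least s vertices (ports)."""
    adj = {name: set() for name in edges}
    for a, b in combinations(edges, 2):
        if len(edges[a] & edges[b]) >= s:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def s_closeness(edges, s=1):
    """Closeness of each hyperedge in the s-line graph, via breadth-first search."""
    adj = s_line_graph(edges, s)
    scores = {}
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())   # sum of distances to reachable hyperedges
        reached = len(dist) - 1      # reachable hyperedges, excluding the start
        scores[start] = reached / total if total else 0.0
    return scores


# Toy flows as (source IP, destination port) pairs; values are illustrative only.
flows = [
    ("10.0.0.5", 22), ("10.0.0.5", 80), ("10.0.0.5", 443),
    ("10.0.0.7", 80), ("10.0.0.7", 443),
    ("10.0.0.9", 22), ("10.0.0.9", 8080),
    ("10.0.0.11", 8080), ("10.0.0.11", 53),
]
edges = build_hyperedges(flows)
scores_s1 = s_closeness(edges, s=1)
scores_s2 = s_closeness(edges, s=2)
print(scores_s1)  # {'10.0.0.5': 0.75, '10.0.0.7': 0.5, '10.0.0.9': 0.75, '10.0.0.11': 0.5}
print(scores_s2)  # {'10.0.0.5': 1.0, '10.0.0.7': 1.0, '10.0.0.9': 0.0, '10.0.0.11': 0.0}
```

Raising $s$ from 1 to 2 keeps only the link between the two sources that share two ports, so the other sources become isolated and score 0. Scores of this kind could be appended to each flow's feature vector before training a classifier.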
Abstract
Network intrusion detection systems (NIDS) continue to face challenges in detecting malicious attacks. NIDS are often developed offline while they face auto-generated port scan infiltration attempts, resulting in a significant time lag between adversarial adaptation and NIDS response. To address these challenges, we use hypergraphs focused on internet protocol addresses and destination ports to capture evolving patterns of port scan attacks. The derived set of hypergraph-based metrics is then used to train an ensemble machine learning (ML) based NIDS that adapts in real time to monitor and detect port scanning activities, other types of attacks, and adversarial intrusions with high accuracy, precision, and recall. This adaptive ML NIDS was developed by combining (1) intrusion examples, (2) NIDS update rules, (3) attack threshold choices to trigger NIDS retraining requests, and (4) a production environment with no prior knowledge of the nature of network traffic. Forty scenarios were auto-generated to evaluate the ML ensemble NIDS, which comprises three tree-based models. The resulting ML ensemble NIDS was then extended and evaluated with the CIC-IDS2017 dataset. Results show that under an Update-ALL-NIDS rule (that is, retraining and updating all three models upon the same NIDS retraining request), the proposed ML ensemble NIDS evolved intelligently and produced the best results, with nearly 100% detection performance throughout the simulation.
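The Update-ALL-NIDS rule described in the abstract can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' implementation: the attack threshold is interpreted as the fraction of flows in a batch that a majority vote flags as attacks, `ToyModel` is a hypothetical one-feature stand-in for a tree-based classifier, and `process_batch` and the labelled pool are illustrative names and data.

```python
class ToyModel:
    """Hypothetical stand-in for a tree-based classifier with a single numeric feature."""

    def __init__(self):
        self.cutoff = float("inf")  # flags nothing until fitted
        self.fit_count = 0

    def fit(self, X, y):
        benign = [x for x, label in zip(X, y) if label == 0]
        attacks = [x for x, label in zip(X, y) if label == 1]
        # Place the cut-off midway between the classes (assumes they are separable).
        self.cutoff = (max(benign) + min(attacks)) / 2
        self.fit_count += 1

    def predict(self, x):
        return 1 if x > self.cutoff else 0


def majority_vote(models, x):
    """Return 1 (attack) when more than half of the models flag the flow."""
    votes = sum(m.predict(x) for m in models)
    return 1 if votes * 2 > len(models) else 0


def process_batch(models, batch, labelled_pool, attack_threshold):
    """Score one batch; issue a retraining request if the flagged fraction exceeds the threshold."""
    flagged = sum(majority_vote(models, x) for x in batch)
    retrain = flagged / len(batch) > attack_threshold
    if retrain:
        X, y = labelled_pool
        for m in models:  # Update-ALL: every model is refit on the same request
            m.fit(X, y)
    return flagged, retrain


# Three models, mirroring the three-model ensemble; all values are illustrative.
models = [ToyModel() for _ in range(3)]
for m in models:
    m.fit([1, 2, 8, 9], [0, 0, 1, 1])  # initial offline training, cut-off 5.0

pool = ([1, 2, 3, 6, 7, 9], [0, 0, 0, 1, 1, 1])  # hypothetical labelled examples

flagged1, retrained1 = process_batch(models, [1, 2, 6, 7, 9], pool, attack_threshold=0.5)
flagged2, retrained2 = process_batch(models, [1, 2, 3, 4], pool, attack_threshold=0.5)
print(flagged1, retrained1)  # 3 True
print(flagged2, retrained2)  # 0 False
```

In the first batch three of five flows are flagged, which exceeds the 0.5 threshold, so all three models are refit together. The second batch stays below the threshold and leaves the models unchanged.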
