Binary and Multiclass Cyberattack Classification on GeNIS Dataset
Miguel Silva, Daniela Pinto, João Vitorino, Eva Maia, Isabel Praça, Ivone Amorim, Maria João Viamonte
TL;DR
The paper evaluates whether GeNIS, a modular network-traffic dataset designed for AI-based Network Intrusion Detection Systems (NIDS), is a reliable benchmark for binary and multiclass attack classification. It combines five feature-selection methods to extract a 16-feature subset that accounts for about 70% of total feature importance, then compares five models (Random Forest, XGBoost, LightGBM, LSTM, and MLP) using a 70/30 train-test split, grid-search tuning, and SHAP explanations. All models reach high accuracy and F1-scores; the tree ensembles generalize slightly better and are more efficient than the deep learning models, and the reduced feature subset largely preserves performance while lowering compute cost. The models rely mainly on quantity-based and time-based flow features, which suggests they may generalize to other datasets. Suggested future work includes larger, more diverse data and robustness to adversarial settings.
Abstract
The integration of Artificial Intelligence (AI) in Network Intrusion Detection Systems (NIDS) is a promising approach to tackle the increasing sophistication of cyberattacks. However, since Machine Learning (ML) and Deep Learning (DL) models rely heavily on the quality of their training data, the lack of diverse and up-to-date datasets hinders their ability to generalize and detect malicious activity in previously unseen network traffic. This study presents an experimental validation of the reliability of the GeNIS dataset for AI-based NIDS, to serve as a baseline for future benchmarks. Five feature selection methods (Information Gain, Chi-Squared Test, Recursive Feature Elimination, Mean Absolute Deviation, and Dispersion Ratio) were combined to identify the most relevant features of GeNIS and reduce its dimensionality, enabling more computationally efficient detection. Three decision tree ensembles and two deep neural networks were trained for both binary and multiclass classification tasks. All models reached high accuracy and F1-scores, and the ML ensembles achieved slightly better generalization while remaining more efficient than the DL models. Overall, the obtained results indicate that the GeNIS dataset supports intelligent intrusion detection and cyberattack classification with time-based and quantity-based behavioral features.
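The summary above does not state how the five feature-selection scores were combined, so the following is only a minimal sketch of one plausible approach: normalize each method's scores so they sum to 1, average them across methods, and keep the top-ranked features until their cumulative share reaches a threshold (0.70 here, matching the "about 70%" figure). The function names, the averaging rule, and the toy feature names and scores are all assumptions made for illustration, not the authors' procedure or GeNIS data.

```python
from typing import Dict, List


def combine_importances(
    scores_by_method: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Average per-method scores after scaling each method to sum to 1.

    Assumption: every method produces non-negative scores where a higher
    value means a more relevant feature.
    """
    features = sorted({f for s in scores_by_method.values() for f in s})
    combined = {f: 0.0 for f in features}
    for scores in scores_by_method.values():
        total = sum(scores.values())
        if total <= 0:
            raise ValueError("each method needs a positive total score")
        for f in features:
            # A feature missing from one method contributes zero for it.
            combined[f] += scores.get(f, 0.0) / total
    n_methods = len(scores_by_method)
    return {f: v / n_methods for f, v in combined.items()}


def select_by_cumulative_importance(
    combined: Dict[str, float], threshold: float = 0.70
) -> List[str]:
    """Return the smallest top-ranked prefix whose share reaches threshold."""
    ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    selected: List[str] = []
    running = 0.0
    for name, weight in ranked:
        selected.append(name)
        running += weight
        if running >= threshold:
            break
    return selected


# Hypothetical scores from two of the five methods, on made-up features.
toy_scores = {
    "information_gain": {"flow_bytes": 5, "flow_duration": 3, "ttl": 1, "flags": 1},
    "chi_squared": {"flow_bytes": 6, "flow_duration": 2, "ttl": 1, "flags": 1},
}
combined = combine_importances(toy_scores)
selected = select_by_cumulative_importance(combined, threshold=0.70)
print(selected)
```

In this toy case the two highest-ranked features together hold roughly 80% of the combined importance, so the selection stops after them. Averaging normalized scores is only one option; rank aggregation or voting across methods would be equally reasonable choices under the same description.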
