Individual Packet Features are a Risk to Model Generalisation in ML-Based Intrusion Detection
Kahraman Kostas, Mike Just, Michael A. Lones
TL;DR
This work critiques the reliance on individual packet features (IPF) in ML-based IoT intrusion detection, showing that IPF can produce misleadingly high accuracy through information leakage and low data complexity. A literature review and experiments on the IoT-NID dataset demonstrate that session-based identifiers and simple features can inflate performance under cross-validation yet fail to generalize to other datasets or real deployments. The authors advocate features that capture packet interactions and context, such as flow-based or window-based features, to improve robustness and generalization in IoT security. The study offers empirical evidence and practical guidance for designing intrusion detection systems (IDS) that withstand dataset shift across diverse IoT environments.
Abstract
Machine learning is increasingly used for intrusion detection in IoT networks. This paper explores the effectiveness of using individual packet features (IPF), which are attributes extracted from a single network packet, such as timing, size, and source-destination information. Through literature review and experiments, we identify the limitations of IPF, showing they can produce misleadingly high detection rates. Our findings emphasize the need for approaches that consider packet interactions for robust intrusion detection. Additionally, we demonstrate that models based on IPF often fail to generalize across datasets, compromising their reliability in diverse IoT environments.
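The leakage effect described above can be illustrated with a small synthetic sketch. This is not the authors' experiment and does not use IoT-NID. It assumes made-up packets whose label is fixed per session and whose size is pure noise, and it uses a hypothetical lookup-table "model" as a stand-in for any classifier able to memorise a session identifier. For brevity it compares a single packet-level hold-out split with a session-level split rather than running full cross-validation.

```python
import random
from collections import Counter, defaultdict

def make_packets(n_sessions=40, packets_per_session=25, seed=0):
    """Synthetic packets as (session_id, size, label) tuples.
    Assumption: the label is fixed per session and size is pure noise."""
    rng = random.Random(seed)
    packets = []
    for sid in range(n_sessions):
        label = rng.randint(0, 1)  # 0 = benign, 1 = attack
        for _ in range(packets_per_session):
            packets.append((sid, rng.randint(60, 1500), label))
    return packets

def fit_lookup(train):
    """Memorise the majority label per session id (hypothetical stand-in model).
    Unseen session ids fall back to the global majority label."""
    per_session = defaultdict(Counter)
    overall = Counter()
    for sid, _size, label in train:
        per_session[sid][label] += 1
        overall[label] += 1
    default = overall.most_common(1)[0][0]
    table = {sid: c.most_common(1)[0][0] for sid, c in per_session.items()}
    return table, default

def accuracy(model, test):
    table, default = model
    hits = sum(table.get(sid, default) == label for sid, _size, label in test)
    return hits / len(test)

def packet_level_split(packets, test_frac=0.3, seed=1):
    """Shuffle individual packets, so one session lands in both train and test."""
    rng = random.Random(seed)
    shuffled = packets[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def session_level_split(packets, test_frac=0.3, seed=1):
    """Hold out whole sessions, so no test session id was seen in training."""
    rng = random.Random(seed)
    sids = sorted({p[0] for p in packets})
    rng.shuffle(sids)
    train_ids = set(sids[: int(len(sids) * (1 - test_frac))])
    train = [p for p in packets if p[0] in train_ids]
    test = [p for p in packets if p[0] not in train_ids]
    return train, test

packets = make_packets()

train, test = packet_level_split(packets)
leaky_acc = accuracy(fit_lookup(train), test)

train, test = session_level_split(packets)
held_out_acc = accuracy(fit_lookup(train), test)

print(f"packet-level split accuracy:  {leaky_acc:.2f}")
print(f"session-level split accuracy: {held_out_acc:.2f}")
```

Under the packet-level split the lookup table scores highly without learning anything about attack behaviour, because every test packet shares a session identifier with training packets. Under the session-level split the same model can only guess the majority class, which is the kind of gap between validation and deployment performance that the abstract warns about.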
