Twin Auto-Encoder Model for Learning Separable Representation in Cyberattack Detection
Phai Vu Dinh, Quang Uy Nguyen, Thai Hoang Dinh, Diep N. Nguyen, Bao Son Pham, Eryk Dutkiewicz
TL;DR
This work introduces the Twin Auto-Encoder (TAE), a representation learning (RL) model for cyberattack detection that eliminates posterior collapse by replacing stochastic sampling with a deterministic transformation that creates separable representation targets. TAE maps inputs to a latent space and then shifts the representations of different classes away from each other via a transformation operator, producing dynamic codewords that improve downstream classifier performance while keeping the model compact (~1 MB) and inference fast. Empirical results on IoT botnet, network IDS, malware, cloud DDoS, and artificial datasets show a gain of about 2% in accuracy and F-score over state-of-the-art RL methods (e.g., MAE, CTVAE), along with robustness to an increasing number of classes. TAE's reconstruction representations exhibit clear inter-class separation, supporting effective detection of sophisticated and unknown attacks with practical computational efficiency for IoT security contexts.
Abstract
Representation learning (RL) methods for cyberattack detection struggle with the diversity and sophistication of attack data, which leads to mixed representations of different classes, particularly as the number of classes increases. To address this, the paper proposes a novel deep learning architecture called the Twin Auto-Encoder (TAE). TAE first maps the input data into a latent space and then deterministically shifts data samples of different classes further apart to create separable data representations, referred to as representation targets. TAE's decoder then projects the input data into these representation targets. After training, TAE's decoder is used to extract data representations. TAE's representation target serves as a novel dynamic codeword, that is, a vector that represents a specific class. This vector is updated after each training epoch for every data sample, in contrast to a conventional fixed codeword, which does not incorporate information from the input data. We conduct extensive experiments on diverse cybersecurity datasets, including seven IoT botnet datasets, two network IDS datasets, three malware datasets, one cloud DDoS dataset, and ten artificial datasets with an increasing number of classes. TAE boosts accuracy and F-score in attack detection by around 2% compared to state-of-the-art models, achieving up to 96.1% average accuracy in IoT attack detection. Additionally, TAE is well-suited for cybersecurity applications and potentially for IoT systems, with a model size of approximately 1 MB and an average running time of around 2.6E-07 seconds to extract the representation of a single data sample.
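The idea of deterministically shifting latent vectors of different classes apart can be illustrated with a small NumPy sketch. The abstract does not specify the transformation operator, so the additive class-dependent offset below, the function names `shift_latent` and `mean_class_distance`, and the `gap` parameter are all assumptions made for illustration, not the authors' method. Random vectors stand in for encoder outputs, and the decoder training step is not shown.

```python
import numpy as np

def shift_latent(z, y, gap=5.0):
    """Hypothetical transformation operator (an assumption, not the paper's):
    offset every coordinate of a class-c latent vector by c * gap."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y)
    return z + gap * y[:, None]

def mean_class_distance(x, y):
    """Average Euclidean distance between distinct class centroids."""
    classes = np.unique(y)
    centroids = np.stack([x[y == c].mean(axis=0) for c in classes])
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    k = len(classes)
    # The diagonal is zero, so dividing by k * (k - 1) averages distinct pairs.
    return dists.sum() / (k * (k - 1))

rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 50)      # three classes, 50 samples each
z = rng.normal(size=(150, 4))        # stand-in for encoder outputs (classes overlap)

targets = shift_latent(z, y)         # per-sample representation targets
before = mean_class_distance(z, y)
after = mean_class_distance(targets, y)
print(f"centroid distance before: {before:.2f}, after: {after:.2f}")
```

Because each target is computed from its own latent vector, it still carries information from the input sample, which is the property that distinguishes a dynamic codeword from a fixed one in the description above.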
