CGGM: A conditional graph generation model with adaptive sparsity for node anomaly detection in IoT networks
Munan Li, Xianshi Su, Runze Ma, Tongbang Jiang, Zijian Li, Tony Q. S. Quek
TL;DR
The paper tackles imbalanced node anomaly detection in IoT networks by introducing CGGM, a conditional graph generation framework that synthesizes minority-class graph snapshots to balance data for downstream detection. CGGM combines an adaptive sparsity adjacency generator with a self-attention–based multi-dimensional feature encoder in a GAN setup, and enforces a latent-space constraint to better match real data distributions. A TDG-based data pipeline feeds a GNN-based anomaly detector, enabling both binary and multi-class detection improvements. Extensive experiments on UNSW-NB15 and CICIDS-2017 show CGGM achieves higher distributional similarity to real data and superior classification performance compared with baselines like CTGAN, TableGAN, GraphRNN, and GraphSGAN, highlighting its practical potential for robust IoT security in imbalanced settings.
Abstract
Dynamic graphs are extensively employed for detecting anomalous node behavior in the Internet of Things (IoT). Graph generative models are often used to address the imbalance of node categories in dynamic graphs. Nevertheless, existing models face several constraints: the monotonicity of adjacency relationships, the difficulty of constructing multi-dimensional node features, and the lack of a method for end-to-end generation of multiple node categories. In this paper, we propose a novel graph generation model, called CGGM, designed specifically to generate samples belonging to the minority class. The framework consists of two core modules: a conditional graph generation module and a graph-based anomaly detection module. The generative module adapts to the sparsity of the adjacency matrix by downsampling a noise adjacency matrix, and incorporates a multi-dimensional feature encoder based on multi-head self-attention to capture latent dependencies among features. Additionally, a latent space constraint is combined with the distribution distance to approximate the latent distribution of real data. The graph-based anomaly detection module uses the generated balanced dataset to predict node behaviors. Extensive experiments show that CGGM outperforms state-of-the-art methods in terms of accuracy and divergence. The results also demonstrate that CGGM can generate diverse data categories, which enhances the performance of the multi-category classification task.
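To make the idea of adapting a noise adjacency matrix to a given sparsity more concrete, the following is a minimal sketch and not the authors' generator. It assumes a simple stand-in technique: score every undirected node pair with uniform noise, then keep only the highest-scoring pairs so that the edge density matches a target value. The function name `sparsify_noise_adjacency` and its parameters are hypothetical and chosen for illustration only.

```python
import random


def sparsify_noise_adjacency(n_nodes, target_density, seed=0):
    """Hypothetical helper: threshold random pair scores to a target edge density.

    This is an illustrative assumption, not the downsampling used in CGGM.
    """
    rng = random.Random(seed)
    # Score every undirected node pair (i < j) with uniform noise.
    pairs = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    scores = {pair: rng.random() for pair in pairs}
    # Keep the highest-scoring pairs so the edge count matches the target density.
    n_edges = round(target_density * len(pairs))
    kept = sorted(pairs, key=scores.get, reverse=True)[:n_edges]
    # Build a symmetric 0/1 adjacency matrix with an empty diagonal.
    adj = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in kept:
        adj[i][j] = adj[j][i] = 1
    return adj


adj = sparsify_noise_adjacency(n_nodes=6, target_density=0.2)
edge_count = sum(sum(row) for row in adj) // 2
```

In this sketch the target density is passed in by hand. In a learned generator it would presumably be estimated from the real graph snapshots, so that synthetic minority-class graphs are neither denser nor sparser than the data they imitate.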
