Interpreting Manifolds and Graph Neural Embeddings from Internet of Things Traffic Flows

Enrique Feito-Casares, Francisco M. Melgarejo-Meseguer, Elena Casiraghi, Giorgio Valentini, José-Luis Rojo-Álvarez

TL;DR

This work addresses the challenge of interpreting high-dimensional GNN embeddings of IoT traffic by projecting them onto a low-dimensional latent manifold with a jointly trained, MnL-informed framework. It combines a GIN-based graph representation with a P-UMAP projection and SHAP-based feature attribution to produce directly visualizable embeddings that preserve topology and come with feature-level explanations. The approach achieves a binary F1 score of 0.830 for intrusion detection and reveals concept drift in which evolving botnet behavior mimics DoS patterns, highlighting both the strengths and the challenges of multiclass separation. In practice, the method enables interpretable network monitoring and interoperability in dynamic IoT environments, guiding security analysts and administrators through topological and semantic shifts in traffic behavior.

Abstract

The rapid expansion of Internet of Things (IoT) ecosystems has led to increasingly complex and heterogeneous network topologies. Traditional network monitoring and visualization tools rely on aggregated metrics or static representations, which fail to capture the evolving relationships and structural dependencies between devices. Although Graph Neural Networks (GNNs) offer a powerful way to learn from relational data, their internal representations often remain opaque and difficult to interpret for security-critical operations. Consequently, this work introduces an interpretable pipeline that generates directly visualizable low-dimensional representations by mapping high-dimensional embeddings onto a latent manifold. This projection enables the interpretable monitoring and interoperability of evolving network states, while integrated feature attribution techniques decode the specific characteristics shaping the manifold structure. The framework achieves a classification F1-score of 0.830 for intrusion detection while also highlighting phenomena such as concept drift. Ultimately, the presented approach bridges the gap between high-dimensional GNN embeddings and human-understandable network behavior, offering new insights for network administrators and security analysts.

Paper Structure

This paper contains 25 sections, 23 equations, 5 figures, and 7 tables.

Figures (5)

  • Figure 1: Overview of the proposed pipeline for IoT network topology and traffic flow representation. The system ingests raw network traffic and performs canonical flow identification via lexicographical ordering. It then constructs a multigraph representation and learns joint embeddings via a coupled GNN and P-UMAP architecture, and applies SHAP for feature attribution to explain the resulting visual insights.
  • Figure 2: Schematic of the proposed architecture integrating coupled GIN and P-UMAP for joint device and flow embedding. The model encodes node and edge features, enforces topological consistency, and either reconstructs the original attributes via dual decoders in an unsupervised setting or classifies edges by minimizing an asymmetric loss in a supervised setting.
  • Figure 3: Comparison of Ground Truth, Model Predictions, and Misclassification distribution. Top Row (Binary): Demonstrates clear separability between Benign and Attack traffic, with minimal errors (c). Bottom Row (Multiclass): Reveals structural overlaps between attack classes (e.g., Mirai and DoS), where errors are densely concentrated (f), indicating semantic ambiguity rather than model failure.
  • Figure 4: Evolution of latent embeddings across three temporal partitions (Mirai vs. DoS) reveals the mechanism of performance degradation. (a) Initially, Mirai and DoS are topologically distinct. (b) The Mirai cluster begins to migrate towards the DoS region. (c) In the final stage, Mirai exhibits mimetic behavior, structurally overlapping with DoS. Red markers highlight misclassified instances, demonstrating that model errors are non-random and localized in the intersection zone between Mirai and DoS.
  • Figure 5: Multi-level interpretability dashboard of the learned GNN latent space. (Center) UMAP projection delineating density regions for DoS (Pink) and Mirai (Cyan). (Right) Global feature importance ranking for pure and intersection zones. (Left) Local SHAP feature attribution.
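
The first two pipeline stages named in Figure 1, canonical flow identification via lexicographical ordering and multigraph construction, can be illustrated with a minimal sketch. It is not the authors' implementation: the packet tuple layout, the names FlowKey, canonical_flow_key, and build_multigraph, and the use of plain string comparison on IP addresses are all assumptions made for illustration. The idea is that sorting the two (address, port) endpoints gives both directions of a conversation the same key, and that several flows between the same pair of devices become parallel edges.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Direction-independent flow identifier (hypothetical layout)."""
    endpoint_a: tuple[str, int]
    endpoint_b: tuple[str, int]
    protocol: str


def canonical_flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    # Sort the two endpoints lexicographically so that A->B and B->A
    # produce the same key. Assumption: addresses are compared as plain
    # strings, which is consistent but not numeric IP ordering.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return FlowKey(a, b, protocol)


def build_multigraph(packets):
    """Group packets into flows, then attach each flow as one edge
    between its two devices. Returns {(ip_a, ip_b): [edge, ...]}, so a
    device pair with several flows carries several parallel edges."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[canonical_flow_key(*pkt[:5])].append(pkt)

    edges = defaultdict(list)
    for key, pkts in flows.items():
        device_pair = (key.endpoint_a[0], key.endpoint_b[0])
        edges[device_pair].append({"key": key, "packets": len(pkts)})
    return dict(edges)


# Illustrative packets: (src_ip, src_port, dst_ip, dst_port, protocol)
packets = [
    ("10.0.0.2", 5000, "10.0.0.1", 80, "TCP"),  # request
    ("10.0.0.1", 80, "10.0.0.2", 5000, "TCP"),  # reply, same flow
    ("10.0.0.2", 5001, "10.0.0.1", 80, "TCP"),  # second flow, same devices
]
graph = build_multigraph(packets)
```

In this toy input the request and its reply collapse into one flow, while the connection from port 5001 forms a second parallel edge between the same two devices. In a full pipeline each edge would carry flow-level features rather than only a packet count.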