Multi-view Correlation-aware Network Traffic Detection on Flow Hypergraph

Jiajun Zhou, Wentao Fu, Hao Song, Shanqing Yu, Qi Xuan, Xiaoniu Yang

TL;DR

This work tackles the challenge of robust network traffic detection under imbalanced labels and diverse scenarios by proposing FlowID, a framework that jointly models multi-view flow features and higher-order flow relationships. It introduces a flow hypergraph encoder (HyperGCN) to capture both flow- and group-level patterns, and augments it with dual-contrastive self-supervision to improve discrimination across traffic types. The approach demonstrates state-of-the-art performance across five real-world datasets, with ablations confirming the value of each component (multi-view features, hypergraph construction, and dual-contrastive learning) and parameter studies highlighting the importance of settings such as $K=3$, $n=40$, and $m=16$. The findings suggest FlowID offers strong generalization and online detection capabilities, making it a practical option for real-time network security and governance tasks, including encrypted and IoT traffic.

Abstract

As the Internet rapidly expands, the increasing complexity and diversity of network activities pose significant challenges to effective network governance and security regulation. Network traffic, which serves as a crucial data carrier of network activities, has become indispensable in this process. Network traffic detection aims to monitor, analyze, and evaluate the data flows transmitted across the network to ensure network security and optimize performance. However, existing network traffic detection methods generally suffer from several limitations: 1) a narrow focus on characterizing traffic features from a single perspective; 2) insufficient exploration of discriminative features for different traffic; 3) poor generalization to different traffic scenarios. To address these issues, we propose a multi-view correlation-aware framework named FlowID for network traffic detection. FlowID captures multi-view traffic features via temporal and interaction awareness, while a hypergraph encoder further explores higher-order relationships between flows. To overcome the challenges of data imbalance and label scarcity, we design a dual-contrastive proxy task, enhancing the framework's ability to differentiate between various traffic flows through traffic-to-traffic and group-to-group contrast. Extensive experiments on five real-world datasets demonstrate that FlowID significantly outperforms existing methods in accuracy, robustness, and generalization across diverse network scenarios, particularly in detecting malicious traffic.
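The abstract describes a hypergraph encoder over flows but does not say how hyperedges are formed or which convolution is used. The sketch below is therefore only one plausible reading, not the authors' implementation. It assumes that each hyperedge groups a flow with its K nearest neighbours in feature space, and that one layer follows the common normalized hypergraph convolution $D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X \Theta$. The function names `knn_incidence` and `hypergraph_conv` are hypothetical.

```python
import numpy as np


def knn_incidence(features: np.ndarray, k: int = 3) -> np.ndarray:
    """Build an incidence matrix H of shape (num_flows, num_hyperedges).

    Assumption (not from the paper): one hyperedge per flow, containing
    that flow and its k nearest neighbours by Euclidean distance.
    """
    num_flows = features.shape[0]
    if k >= num_flows:
        raise ValueError("k must be smaller than the number of flows")
    # Pairwise squared Euclidean distances between flow feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Force each flow to rank first in its own neighbour list.
    np.fill_diagonal(dist, -1.0)
    incidence = np.zeros((num_flows, num_flows))
    for edge in range(num_flows):
        members = np.argsort(dist[edge], kind="stable")[: k + 1]
        incidence[members, edge] = 1.0
    return incidence


def hypergraph_conv(x: np.ndarray, incidence: np.ndarray,
                    theta: np.ndarray) -> np.ndarray:
    """One normalized hypergraph convolution layer followed by ReLU."""
    vertex_degree = incidence.sum(axis=1)  # >= 1: each flow is in its own edge
    edge_degree = incidence.sum(axis=0)    # == k + 1 for every hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(vertex_degree)
    de_inv = 1.0 / edge_degree
    # D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2}
    left = dv_inv_sqrt[:, None] * incidence * de_inv[None, :]
    right = incidence.T * dv_inv_sqrt[None, :]
    return np.maximum(left @ right @ x @ theta, 0.0)


# Toy usage: 6 flows with 4-dimensional features, projected to 2 dimensions.
rng = np.random.default_rng(0)
flow_features = rng.normal(size=(6, 4))
H = knn_incidence(flow_features, k=3)
embeddings = hypergraph_conv(flow_features, H, rng.normal(size=(4, 2)))
```

Under this construction every hyperedge has exactly K + 1 members, so K directly controls how many flows share a group-level pattern.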

Paper Structure (27 sections, 13 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: The architecture of FlowID: 1) Multi-view traffic feature extraction via temporal feature awareness and interaction feature awareness; 2) Traffic hypergraph construction to capture higher-order flow relationships; 3) Traffic hypergraph learning using hypergraph convolution and dual-contrastive learning.
  • Figure 2: An example of traffic sequence and graph construction under HTTP request-response.
  • Figure 3: The impact of the number of packets ($n$) and the number of payload bytes ($m$) used in the multi-view feature extraction process on the framework's performance.
  • Figure 4: The impact of the K-value setting during flow hypergraph construction on the framework's performance.
  • Figure 5: Traffic classification gain (%) on all datasets when contrasting different augmentation pairs, relative to a no-augmentation version of FlowID. "Iden" denotes the original, unaugmented view.
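
Figure 3 refers to the number of packets ($n$) and payload bytes ($m$) used during feature extraction, but this page does not say how they are applied. As a rough illustration only, the sketch below assumes the first $n$ packets of a flow and the first $m$ payload bytes of each packet are kept and zero-padded into a fixed-size matrix. The function name `flow_to_byte_matrix` is hypothetical; the defaults come from the $n=40$, $m=16$ settings mentioned in the TL;DR.

```python
from typing import Sequence

import numpy as np


def flow_to_byte_matrix(payloads: Sequence[bytes], n: int = 40,
                        m: int = 16) -> np.ndarray:
    """Truncate or zero-pad a flow to an (n, m) matrix of payload bytes.

    Assumption (not from the paper): row i holds the first m payload bytes
    of packet i; missing packets and short payloads are padded with zeros.
    """
    matrix = np.zeros((n, m), dtype=np.uint8)
    for i, payload in enumerate(payloads[:n]):
        chunk = payload[:m]
        if chunk:  # skip empty payloads, e.g. bare TCP ACKs
            matrix[i, : len(chunk)] = np.frombuffer(chunk, dtype=np.uint8)
    return matrix


# Toy usage: a three-packet flow, one packet with an empty payload.
example_flow = [b"GET / HTTP/1.1\r\n\r\n", b"", b"\x01\x02"]
byte_matrix = flow_to_byte_matrix(example_flow)
```

A fixed shape of this kind lets flows of different lengths be batched together, with some loss of information from long flows or large payloads.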

Theorems & Definitions (1)

  • Definition 1