MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification

Xu-Yang Chen; Lu Han; De-Chuan Zhan; Han-Jia Ye

TL;DR

MIETT (Multi-Instance Encrypted Traffic Transformer) tackles encrypted traffic classification by moving beyond token-level analysis to capture flow-level dynamics. Its Two-Level Attention (TLA) layers model both intra-packet and inter-packet dependencies, and it is pre-trained with Masked Flow Prediction plus two novel tasks, Packet Relative Position Prediction (PRPP) and Flow Contrastive Learning (FCL). The model leverages a frozen packet-attention backbone from prior work and learns robust flow representations through per-flow CLS tokens, achieving state-of-the-art results across five datasets. This approach improves generalization to unseen traffic and offers a scalable, flow-aware framework for encrypted traffic analysis, with practical implications for security and network management.

Abstract

Network traffic includes data transmitted across a network, such as web browsing and file transfers, and is organized into packets (small units of data) and flows (sequences of packets exchanged between two endpoints). Classifying encrypted traffic is essential for detecting security threats and optimizing network management. Recent advancements have highlighted the superiority of foundation models in this task, particularly for their ability to leverage large amounts of unlabeled data and demonstrate strong generalization to unseen data. However, existing methods that focus on token-level relationships fail to capture broader flow patterns, as tokens, defined as sequences of hexadecimal digits, typically carry limited semantic information in encrypted traffic. These flow patterns, which are crucial for traffic classification, arise from the interactions between packets within a flow, not just their internal structure. To address this limitation, we propose a Multi-Instance Encrypted Traffic Transformer (MIETT), which adopts a multi-instance approach where each packet is treated as a distinct instance within a larger bag representing the entire flow. This enables the model to capture both token-level and packet-level relationships more effectively through Two-Level Attention (TLA) layers, improving the model's ability to learn complex packet dynamics and flow patterns. We further enhance the model's understanding of temporal and flow-specific dynamics by introducing two novel pre-training tasks: Packet Relative Position Prediction (PRPP) and Flow Contrastive Learning (FCL). After fine-tuning, MIETT achieves state-of-the-art (SOTA) results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors. Code is available at https://github.com/Secilia-Cxy/MIETT.
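
To make the two-level idea concrete, here is a minimal pure-Python sketch of attention applied first within each packet and then across packets. It is an illustration under stated assumptions, not the authors' implementation: the attention is parameter-free (no learned query, key, or value projections), every packet is assumed to have the same number of tokens, and the choice to run inter-packet attention separately at each token position is our assumption about how a TLA layer could be arranged. The function names `self_attention` and `two_level_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Parameter-free scaled dot-product self-attention over a list of vectors.

    Assumption: queries, keys, and values are the inputs themselves;
    a real model would apply learned projections first.
    """
    dim = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)])
    return out


def two_level_attention(flow):
    """flow[p][t] is the embedding of token t in packet p.

    The flow is the "bag" and each packet is one instance in it.
    """
    # Level 1: intra-packet attention, tokens attend to tokens of the same packet.
    intra = [self_attention(packet) for packet in flow]

    # Level 2: inter-packet attention, applied per token position (an assumption),
    # so that a token attends to the same position in the other packets.
    n_packets, n_tokens = len(intra), len(intra[0])
    out = [[None] * n_tokens for _ in range(n_packets)]
    for t in range(n_tokens):
        column = self_attention([intra[p][t] for p in range(n_packets)])
        for p in range(n_packets):
            out[p][t] = column[p]
    return out


# Toy flow: 2 packets, 3 tokens each, embedding dimension 2.
flow = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 0.0]],
]
mixed = two_level_attention(flow)
print(len(mixed), len(mixed[0]), len(mixed[0][0]))  # prints: 2 3 2
```

The output keeps the packets x tokens x dimension layout of the input, so layers of this kind can be stacked. After the second level, each token embedding carries information from its own packet and from the other packets in the flow.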
