One Train for Two Tasks: An Encrypted Traffic Classification Framework Using Supervised Contrastive Learning
Haozhen Zhang, Xi Xiao, Le Yu, Qing Li, Zhen Ling, Ye Zhang
TL;DR
This work tackles encrypted traffic classification by unifying the packet-level and flow-level tasks within a single model. It introduces CLE-TFE, which combines supervised contrastive learning at both levels with graph-based augmentation of the byte-level traffic graph and cross-level multi-task training, producing robust representations at a fraction of the computation required by large pre-trained models (about 1/14 of the FLOPs of ET-BERT). Key contributions include a dual-level contrastive framework, a cross-level training strategy that lets the flow-level task exploit packet representations, and experiments on the ISCX VPN and Tor datasets showing the best overall performance with minimal parameter overhead. The approach is relevant to efficient, robust encrypted traffic classification in real-world network monitoring systems.
Abstract
As network security receives widespread attention, encrypted traffic classification has become a current research focus. However, existing methods classify traffic without sufficiently considering the characteristics shared between data samples, leading to suboptimal performance. Moreover, they train the packet-level and flow-level classification tasks independently, which is redundant because the packet representations learned in the packet-level task can be exploited by the flow-level task. Therefore, in this paper, we propose an effective model named Contrastive Learning Enhanced Temporal Fusion Encoder (CLE-TFE). In particular, we use supervised contrastive learning to enhance the packet-level and flow-level representations, and we perform graph data augmentation on the byte-level traffic graph so that fine-grained, semantic-invariant characteristics between bytes can be captured through contrastive learning. We also propose cross-level multi-task learning, which accomplishes the packet-level and flow-level classification tasks simultaneously in the same model with a single training run. Experiments show that CLE-TFE achieves the best overall performance on the two tasks, while its computational overhead (i.e., floating point operations, FLOPs) is only about 1/14 of that of a pre-trained model (e.g., ET-BERT). We release the code at https://github.com/ViktorAxelsen/CLE-TFE
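To make the two central ideas concrete, the following is a minimal NumPy sketch of a supervised contrastive loss and of a combined packet-level and flow-level objective. It is an illustration under stated assumptions, not the released CLE-TFE implementation: the loss follows the general supervised contrastive formulation (same-label samples in a batch act as positives), and the function names, the temperature value, and the weights `alpha` and `beta` are hypothetical choices made only for this example.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss for one batch (illustrative sketch).

    Samples sharing a label are treated as positives for each other;
    all other samples in the batch act as negatives.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    n = z.shape[0]
    if n < 2:
        return 0.0  # no pairs to contrast

    # L2-normalize so the dot product is a cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude each anchor's similarity to itself.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Row-wise log-softmax, stabilized by subtracting the row maximum.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: same label, different sample.
    pos_mask = (y[:, None] == y[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0

    # Average log-probability over the positives of each anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos_count[has_pos]
    return float(per_anchor.mean())


def cross_level_objective(ce_packet, ce_flow, con_packet, con_flow,
                          alpha=1.0, beta=1.0):
    """Combine both tasks into one training objective.

    alpha and beta are hypothetical weights for the packet-level and
    flow-level contrastive terms; they are not taken from the paper.
    """
    return ce_packet + ce_flow + alpha * con_packet + beta * con_flow


if __name__ == "__main__":
    # Toy packet embeddings: two tight clusters in 2-D.
    packets = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(packets, [0, 0, 1, 1]))
    print(supervised_contrastive_loss(packets, [0, 1, 0, 1]))
```

In this toy setup the loss is small when the labels agree with the clusters in embedding space and large when they cut across them, which is the pressure that pulls same-class packet or flow representations together. Summing the classification and contrastive terms for both levels is one simple way to train the two tasks in a single run.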
