Drift-oriented Self-evolving Encrypted Traffic Application Classification for Actual Network Environment

Zihan Chen, Guang Cheng, Jinhui Li, Tian Qin, Yuyang Zhou, Xing Luan

TL;DR

This work tackles rapid feature concept drift in encrypted traffic classification in actual networks, where applications update frequently. It proposes a drift-oriented self-evolving fine-tuning framework that uses a windowed multi-threshold accumulation measure to detect drift and a silver-sample strategy based on the Laida criterion (the three-sigma rule) to perform Fully Fine-Tuning (FFT) without labeled data. The approach is model-agnostic and is demonstrated on LS-LSTM with FFT, extending the classifier's life to more than eight months and achieving about a 9% F1-score improvement on subsequent data. The results indicate substantially lower retraining costs and better resilience to continual application updates, suggesting practical viability for real-world network management and security.

Abstract

Encrypted traffic classification is a crucial source of decision-making information for network management and security protection. Its advantages include timely response, the capacity to handle large-scale data, and analysis across time and space. Research on encrypted traffic classification has gradually moved from the closed world to the open world, and many classifier optimization and feature engineering schemes have been proposed. However, encrypted traffic classification has yet to be applied effectively in the actual network environment. The main reason is that applications on the Internet are constantly updated, through both function adjustments and version changes, which causes severe feature concept drift and makes the classifier fail rapidly. As a result, the entire model must be retrained after only a short time, at an unacceptable cost in labeled-sample construction and model training. To solve this problem, we study the characteristics of Internet application updates in depth, associate them with feature concept drift, and propose self-evolving encrypted traffic classification. We propose a method for determining feature concept drift and a drift-oriented self-evolving fine-tuning method based on the Laida criterion, which adapts to all applications that are likely to be updated. Without exactly labeled samples, the classifier evolves continuously through full fine-tuning, and the interval between two necessary retrainings is extended enough for the classifier to be used in the actual network environment. Experiments show that our approach significantly improves the performance of the original classifier on datasets from the following months (a 9% improvement in F1-score) without any hard-to-acquire labeled samples. In the current experimental environment, the life of the classifier is extended to more than eight months.
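
The abstract names the Laida criterion but does not spell out how it is applied. As a rough illustration of the criterion itself (the three-sigma rule), the sketch below keeps only values that lie within three standard deviations of the mean. The function name `laida_inliers`, the `distances` data, and the idea of screening pseudo-labeled flows by a per-flow distance score are assumptions made for this example, not the authors' procedure.

```python
from statistics import mean, pstdev


def laida_inliers(values, k=3.0):
    """Return indices of values within k standard deviations of the mean.

    With k=3 this is the Laida (three-sigma) criterion; anything outside
    the band is treated as an outlier and dropped.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(range(len(values)))
    return [i for i, v in enumerate(values) if abs(v - mu) <= k * sigma]


# Hypothetical scores: distance of each pseudo-labeled flow to the centre
# of its predicted class. The last flow is far from the rest.
distances = [1.0] * 20 + [50.0]

# Indices of flows that pass the three-sigma check (assumed here to be the
# candidates that would be kept for unsupervised fine-tuning).
silver = laida_inliers(distances)
print(silver)
```

In this toy data the final flow falls outside the three-sigma band and is excluded, while the other twenty are kept. Note that the rule only works with enough samples: in a very small set, no single value can sit more than three standard deviations from the mean.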

Paper Structure (12 sections, 1 figure, 3 tables)