CBR -- Boosting Adaptive Classification By Retrieval of Encrypted Network Traffic with Out-of-distribution

Amir Lukach, Ran Dubin, Amit Dvir, Chen Hajaj

TL;DR

The paper addresses encrypted network traffic classification and the pressing need to adapt to unseen classes without costly retraining. It proposes Adaptive Classification By Retrieval (CBR), an ANN-based retrieval framework that uses a vector database (ElasticSearch) and semi-supervised out-of-distribution detection to classify known and new classes in real time. Key contributions include demonstrating near-RF level accuracy on BOA and MTA datasets with few-shot new-class additions, providing extensive comparisons across feature sets and ANN search algorithms, and offering practical deployment guidance for integrating CBR with existing classifiers. The work presents a scalable, encryption-friendly approach that reduces retraining overhead while maintaining robust performance, and it lays the groundwork for broader adoption of few-shot learning and OOD handling in network traffic classification.

Abstract

Encrypted network traffic classification has been approached in different ways and with different goals. A common approach applies machine learning or deep learning models trained on a fixed number of classes, which leads to misclassification when a sample from an unknown class is given as input. One way to handle unknown classes is to retrain the model; however, retraining every time a model becomes obsolete is costly in both resources and time. There is therefore a growing need for classification models that detect and adapt to new classes dynamically, without retraining, by using few-shot learning [1]. In this paper, we introduce Adaptive Classification By Retrieval (CBR), a novel approach for encrypted network traffic classification. Our approach is based on approximate nearest neighbor (ANN) search, which allows us to effectively identify new and existing classes without retraining the model. The approach is simple yet effective: it achieves results within 5% of Random Forest (RF), and usually closer, on the classification tasks, with a slight decrease for samples from new classes that are added without retraining. In summary, the method performs real-time classification and can classify new classes without retraining. Furthermore, our solution can complement RF or any other machine learning or deep learning classifier as part of an aggregated solution.
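The retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: exact brute-force search stands in for the ANN index and vector database, a plain distance cutoff stands in for the paper's semi-supervised out-of-distribution detection, and the class name `RetrievalClassifier`, the parameters `k` and `ood_threshold`, the 2-D feature vectors, and the labels are all hypothetical.

```python
import math
from collections import Counter


class RetrievalClassifier:
    """Toy classification by retrieval (exact search stands in for ANN)."""

    def __init__(self, k=3, ood_threshold=1.0):
        self.k = k                          # neighbors that vote (assumed value)
        self.ood_threshold = ood_threshold  # hypothetical distance cutoff
        self.vectors = []
        self.labels = []

    def add(self, vector, label):
        # Adding a sample, or a whole new class, is an index insert;
        # nothing is retrained.
        self.vectors.append(tuple(vector))
        self.labels.append(label)

    def classify(self, query):
        if not self.vectors:
            return "OOD"
        # Rank every stored vector by Euclidean distance to the query.
        ranked = sorted(
            (math.dist(query, v), lbl)
            for v, lbl in zip(self.vectors, self.labels)
        )
        nearest = ranked[: self.k]
        # If even the closest stored sample is too far, flag the query as OOD.
        if nearest[0][0] > self.ood_threshold:
            return "OOD"
        # Otherwise take a majority vote among the k nearest labels.
        votes = Counter(lbl for _, lbl in nearest)
        return votes.most_common(1)[0][0]


clf = RetrievalClassifier(k=3, ood_threshold=1.0)
for v in [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)]:
    clf.add(v, "video")
for v in [(5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]:
    clf.add(v, "chat")

known = clf.classify((0.1, 0.1))    # close to the "video" samples
unseen = clf.classify((10.0, 0.0))  # far from every stored sample

# Few-shot addition of a new class: two labeled samples, no retraining.
for v in [(10.0, 0.1), (9.9, -0.1)]:
    clf.add(v, "voip")
after = clf.classify((10.0, 0.0))

print(known, unseen, after)
```

In this sketch the same query moves from OOD to a concrete label once two samples of the new class are indexed, which is the behavior the abstract describes as classifying new classes without retraining.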

Paper Structure

This paper contains 15 sections, 6 figures, and 9 tables.

Figures (6)

  • Figure 1: CBR Architecture
  • Figure 2: ANN with OOD
  • Figure 3: CBR accuracy results as a function of the number of packets - BOA dataset. As can be seen, the accuracy stops improving after 10 packets.
  • Figure 4: CBR with OOD. Samples are classified as new classes or as OOD according to their distances from the nearest classes.
  • Figure 5: CBR accuracy results as a function of the number of packets - MTA dataset. Note that the accuracy does not improve after 98 packets.
  • ...and 1 more figures