Packet Inspection Transformer: A Self-Supervised Journey to Unseen Malware Detection with Few Samples

Kyle Stein, Arash Mahyari, Guillermo Francia, Eman El-Sheikh

TL;DR

This paper tackles unseen malware detection in deep packet inspection (DPI) by pre-training a byte-level transformer with self-supervised masked language modeling on unlabeled packet bytes, producing embeddings usable for both supervised and few-shot downstream tasks. It reports strong binary and multiclass malware classification performance on UNSW-NB15 and CIC-IoT23, and shows competitive few-shot accuracy on unseen classes with minimal labeled data. A key finding is that a simple ML classifier on top of robust self-supervised learning (SSL) embeddings can outperform more complex CNN/LSTM baselines, enabling real-time deployment. The work also analyzes the limitations posed by encrypted traffic and outlines future directions in continual learning and edge deployment to enhance practical applicability.
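To make the masked language modeling idea concrete, here is a minimal sketch of how packet bytes might be masked before pre-training. It is an illustration under assumptions, not the authors' implementation: the function name `mask_packet_bytes`, the mask token id `256`, the 15% masking rate, and the `-100` ignore label are all hypothetical choices made for this example.

```python
import random

MASK_ID = 256   # assumed token id outside the byte range 0..255
IGNORE = -100   # assumed "do not score" label; matches the default ignore_index of PyTorch's CrossEntropyLoss


def mask_packet_bytes(packet: bytes, mask_prob: float = 0.15, seed=None):
    """Return (tokens, labels) for one packet.

    Each byte is replaced by MASK_ID with probability mask_prob.
    labels holds the original byte at masked positions and IGNORE elsewhere,
    so a model is only trained to reconstruct the hidden bytes.
    """
    rng = random.Random(seed)
    tokens = list(packet)              # bytes -> list of ints in 0..255
    labels = [IGNORE] * len(tokens)
    for i, original in enumerate(packet):
        if rng.random() < mask_prob:
            labels[i] = original       # target the model must predict
            tokens[i] = MASK_ID        # hide the byte from the model
    return tokens, labels


if __name__ == "__main__":
    toks, labs = mask_packet_bytes(b"GET /index.html HTTP/1.1", seed=0)
    print(toks)
    print(labs)
```

No labels are needed for this step, which is why the pre-training can use unlabeled traffic: the packet itself supplies the prediction targets.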

Abstract

As networks continue to expand and become more interconnected, the need for novel malware detection methods becomes more pronounced. Traditional security measures are increasingly inadequate against the sophistication of modern cyber attacks. Deep Packet Inspection (DPI) has been pivotal in enhancing network security, offering an in-depth analysis of network traffic that surpasses conventional monitoring techniques. DPI not only examines the metadata of network packets, but also dives into the actual content being carried within the packet payloads, providing a comprehensive view of the data flowing through networks. While the integration of advanced deep learning techniques with DPI has introduced modern methodologies into malware detection and network traffic classification, state-of-the-art supervised learning approaches are limited by their reliance on large amounts of annotated data and their inability to generalize to novel, unseen malware threats. To address these limitations, this paper leverages recent advances in self-supervised learning (SSL) and few-shot learning (FSL). Our proposed approach trains a transformer via SSL to learn embeddings of packet content, including the payload, from vast amounts of unlabeled data by masking portions of packets, leading to a learned representation that generalizes to various downstream tasks. Once the representations are extracted from the packets, they are used to train a malware detection algorithm. The representations obtained from the transformer are then used to adapt the malware detector to novel types of attacks using few-shot learning approaches. Our experimental results demonstrate that our method achieves classification accuracies of up to 94.76% on the UNSW-NB15 dataset and 83.25% on the CIC-IoT23 dataset.
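The abstract does not say which few-shot method is used, so the following is only a hedged sketch of one common option: nearest-prototype classification over fixed embeddings. The helper names `class_prototypes` and `classify`, the Euclidean distance, and the toy 2-D vectors are assumptions for illustration; in practice the vectors would be the transformer's packet embeddings.

```python
import math
from collections import defaultdict


def class_prototypes(support):
    """support: list of (embedding, label) pairs, a few per class.

    Returns a dict mapping each label to the mean of its embeddings.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(query, prototypes):
    """Assign the query embedding to the class with the closest prototype."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


if __name__ == "__main__":
    # Two labeled examples ("shots") per class; the vectors are made-up stand-ins.
    support = [([0.0, 0.0], "benign"), ([0.0, 2.0], "benign"),
               ([10.0, 10.0], "worm"), ([12.0, 10.0], "worm")]
    protos = class_prototypes(support)
    print(classify([1.0, 1.0], protos))
    print(classify([9.0, 9.0], protos))
```

Because the encoder stays frozen, adding a new attack class in this sketch only requires computing one more mean vector from a handful of labeled packets.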

Paper Structure

This paper contains 41 sections, 8 equations, 8 figures, 8 tables, 2 algorithms.

Figures (8)

  • Figure 1: Network packet structure.
  • Figure 2: Self-supervised learning with Masked Language Modeling.
  • Figure 3: The overall architecture of the proposed packet detection algorithm.
  • Figure 4: Comparison of accuracy and inference time on the UNSW-NB15 and CIC-IoT23 datasets. Accuracy trends for different untrained classes are displayed alongside the average inference time per episode for each number of shots.
  • Figure 5: Comparison of accuracies of benchmarks vs. few-shot classification.
  • ...and 3 more figures