Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification
Bruna Bazaluk, Mosab Hamdan, Mustafa Ghaleb, Mohammed S. M. Gismalla, Flavio S. Correa da Silva, Daniel Macêdo Batista
TL;DR
This work presents ITCT (IoT Traffic Classification Transformer), a model built on TabTransformer and pre-trained on a large MQTT-based IoT traffic dataset so that it can be fine-tuned effectively with limited labeled data. By combining categorical feature embeddings, Transformer contextualization, and an MLP classifier, ITCT achieves around $82\%$ overall accuracy on MQTT-IoT-IDS2020 while remaining computationally efficient. The study demonstrates that pre-training on large IoT datasets, combined with selective feature processing, yields robust performance with favorable resource usage. The authors provide publicly available code and plan a HuggingFace release to facilitate broad adoption. The approach offers a practical path toward scalable, accurate IoT traffic classification in real-world networks.
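To make the three stages named above concrete (categorical feature embeddings, Transformer contextualization, MLP classifier), the following is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the class name `TinyTabTransformer`, the dimensions, the column cardinalities, and the number of classes are all hypothetical, the weights are random rather than trained, and a single residual self-attention step stands in for the full Transformer layers (multi-head attention, feed-forward sublayers, and layer normalization are omitted).

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class TinyTabTransformer:
    """Hypothetical, simplified TabTransformer-style classifier (assumption)."""

    def __init__(self, cardinalities, n_continuous, dim=8, hidden=16, n_classes=3):
        self.dim = dim
        # Stage 1: one embedding table per categorical column.
        self.embeddings = [rand_matrix(card, dim) for card in cardinalities]
        # Stage 2: projections for one single-head self-attention step.
        self.wq, self.wk, self.wv = (rand_matrix(dim, dim) for _ in range(3))
        # Stage 3: MLP over contextual embeddings plus continuous features.
        in_size = dim * len(cardinalities) + n_continuous
        self.w1 = rand_matrix(hidden, in_size)
        self.w2 = rand_matrix(n_classes, hidden)

    def contextualize(self, tokens):
        """Let every column embedding attend to all column embeddings."""
        q = [matvec(self.wq, t) for t in tokens]
        k = [matvec(self.wk, t) for t in tokens]
        v = [matvec(self.wv, t) for t in tokens]
        out = []
        for i, token in enumerate(tokens):
            scores = [
                sum(a * b for a, b in zip(q[i], key)) / math.sqrt(self.dim)
                for key in k
            ]
            weights = softmax(scores)
            mixed = [
                sum(w * val[d] for w, val in zip(weights, v))
                for d in range(self.dim)
            ]
            # Residual connection: original embedding plus attended context.
            out.append([t + m for t, m in zip(token, mixed)])
        return out

    def forward(self, categorical, continuous):
        """Return class probabilities for one flow record."""
        tokens = [table[idx] for table, idx in zip(self.embeddings, categorical)]
        context = self.contextualize(tokens)
        # Continuous features bypass the attention step and are concatenated.
        flat = [x for tok in context for x in tok] + list(continuous)
        hidden = [max(0.0, h) for h in matvec(self.w1, flat)]  # ReLU
        return softmax(matvec(self.w2, hidden))


# Two hypothetical categorical columns (3 and 4 distinct values)
# and two hypothetical continuous features.
model = TinyTabTransformer(cardinalities=[3, 4], n_continuous=2)
probs = model.forward(categorical=[1, 2], continuous=[0.5, -1.2])
print(len(probs), round(sum(probs), 6))
```

In this sketch only the categorical columns pass through attention, while continuous features are appended just before the MLP. That split is one way to read the "selective feature processing" mentioned above, though the source does not spell out the exact mechanism.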
Abstract
Classifying IoT traffic is important for improving the efficiency and security of IoT-based networks. Because state-of-the-art classification methods are based on Deep Learning, most current approaches require a large amount of training data. In real-life situations, where IoT traffic data is scarce, such models do not perform well: they underperform outside their initial training conditions and fail to capture the complex characteristics of network traffic, which makes them inefficient and unreliable in real-world applications. In this paper, we propose the IoT Traffic Classification Transformer (ITCT), a novel approach built on TabTransformer, a state-of-the-art Transformer-based model. ITCT is pre-trained on a large labeled MQTT-based IoT traffic dataset and can be fine-tuned with a small set of labeled data; it showed promising results in various traffic classification tasks. Our experiments demonstrated that the ITCT model significantly outperforms existing models, achieving an overall accuracy of 82%. To support reproducibility and collaborative development, all associated code has been made publicly available.
