HFL-FlowLLM: Large Language Models for Network Traffic Flow Classification in Heterogeneous Federated Learning
Jiazhuo Tian, Yachao Yuan
TL;DR
The paper tackles traffic flow classification in privacy-preserving heterogeneous federated learning settings common in 5G/IoT networks. It introduces HFL-FlowLLM, a framework that repurposes large language models by replacing the autoregressive head with a network head, compressing the model via layer dropping, and fine-tuning only near the output using LoRA, while deploying a noise-free, stacking-based aggregation and adaptive client training. The approach yields substantial gains over both HFL baselines and prior LLM-FL frameworks, achieving around $13\%$ higher average F1 than non-LLM HFL methods and up to $5\%$ improvements over other LLM-FL frameworks as client participation grows, with training costs reduced by about $87\%$. The results across five public datasets demonstrate strong accuracy, generalization, and robustness under non-IID conditions, highlighting the practical value of integrating LLMs into distributed network security workflows.
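To make the "noise-free, stacking-based aggregation" idea concrete, the following is a minimal NumPy sketch of one common way to aggregate LoRA adapters whose ranks differ across clients. It is an illustration under stated assumptions, not the authors' implementation: the function name `stack_lora`, the client weights, and the matrix shapes are all hypothetical. The assumption is that each client k holds a low-rank update B_k A_k; concatenating the weighted B_k along columns and the A_k along rows yields a stacked pair whose product equals the weighted sum of the client updates exactly, which avoids the cross terms that arise when B and A are averaged separately.

```python
import numpy as np

def stack_lora(client_updates, weights):
    """Stack heterogeneous-rank LoRA factors (hypothetical helper).

    client_updates: list of (B, A) pairs, B of shape (d_out, r_k),
                    A of shape (r_k, d_in); r_k may differ per client.
    weights:        per-client aggregation weights (assumed to sum to 1).
    Returns (B_stack, A_stack) with B_stack @ A_stack == sum_k w_k * B_k @ A_k.
    """
    # Scale each client's B by its weight, then concatenate along the rank axis.
    b_stack = np.concatenate(
        [w * b for (b, _), w in zip(client_updates, weights)], axis=1
    )
    # Concatenate the A factors along the same rank axis (rows).
    a_stack = np.concatenate([a for _, a in client_updates], axis=0)
    return b_stack, a_stack

# Toy setup: three clients with different LoRA ranks (assumed values).
rng = np.random.default_rng(0)
d_out, d_in = 8, 6
ranks = [2, 4, 3]
clients = [
    (rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))) for r in ranks
]
weights = [0.5, 0.3, 0.2]

B_stack, A_stack = stack_lora(clients, weights)

# Reference: the exact weighted sum of the per-client low-rank updates.
exact = sum(w * (b @ a) for (b, a), w in zip(clients, weights))
```

In this sketch the stacked rank is the sum of the client ranks, so the server-side adapter grows with the number of participating clients; how the paper handles that growth is not stated in this summary.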
Abstract
In modern communication networks driven by 5G and the Internet of Things (IoT), effective network traffic flow classification is crucial for Quality of Service (QoS) management and security. Traditional centralized machine learning struggles with the distributed data and privacy concerns in these heterogeneous environments, while existing federated learning approaches suffer from high costs and poor generalization. To address these challenges, we propose HFL-FlowLLM, which to our knowledge is the first framework to apply large language models to network traffic flow classification in heterogeneous federated learning. Compared to state-of-the-art heterogeneous federated learning methods for network traffic flow classification, the proposed approach improves the average F1 score by approximately 13%, demonstrating compelling performance and strong robustness. Compared to existing large language model federated learning frameworks, the proposed method achieves up to a 5% improvement in average F1 score as the number of clients participating in each training round increases, while reducing training costs by about 87%. These findings demonstrate the potential and practical value of HFL-FlowLLM for security in modern communication networks.
