Gradient Compression and Correlation Driven Federated Learning for Wireless Traffic Prediction

Chuanting Zhang; Haixia Zhang; Shuping Dang; Basem Shihada; Mohamed-Slim Alouini

TL;DR

This work tackles wireless traffic prediction at the network edge by integrating gradient compression with correlation-driven federated learning. It introduces gradient sparsification augmented by error feedback and gradient tracking to reduce communication while preserving accuracy, and leverages a gradient-correlation matrix to create three personalized aggregation strategies (k-relevant, $\delta$-threshold, all-correlated) that capture spatial dependencies among SCUs. Experiments on Milan and Trentino datasets show the proposed method achieving state-of-the-art prediction performance with up to two orders of magnitude improvement in communication efficiency compared to baselines. The approach enables scalable, privacy-preserving, spatially aware edge intelligence for proactive network management in future wireless systems.

Abstract

Wireless traffic prediction plays an indispensable role in cellular networks, enabling communication systems to adapt proactively. Along this line, Federated Learning (FL)-based wireless traffic prediction at the edge has attracted enormous attention because it avoids raw data transmission and enhances privacy protection. However, FL-based wireless traffic prediction methods still rely on heavy data transmission between local clients and the server for local model updates. Moreover, how to model the spatial dependencies of local clients within the FL framework remains an open question. To tackle these issues, we propose an innovative FL algorithm that employs gradient compression and correlation-driven techniques, effectively minimizing the data transmission load while preserving prediction accuracy. Our approach begins by introducing gradient sparsification into wireless traffic prediction, allowing for significant data compression during model training. We then apply error feedback and gradient tracking to mitigate any performance degradation resulting from this compression. Finally, we develop three tailored model aggregation strategies anchored in gradient correlation, which capture spatial dependencies across diverse clients. Experiments on two real-world datasets demonstrate that, by capturing the spatio-temporal characteristics of and correlations among local clients, the proposed algorithm outperforms state-of-the-art algorithms and increases communication efficiency by up to two orders of magnitude without losing prediction accuracy. Code is available at https://github.com/chuanting/FedGCC.
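To make the compression step described in the abstract concrete, the following is a minimal sketch of gradient sparsification with error feedback. It is not taken from the FedGCC repository. It assumes that a fraction $\gamma$ of the entries with the largest magnitude is kept (a common top-k rule), that the number of kept entries is obtained by rounding $\gamma$ times the vector length, and that the discarded part is carried over as a residual into the next round. The function names `sparsify_top_k` and `compress_with_error_feedback` are hypothetical, and gradient tracking is not covered here.

```python
def sparsify_top_k(grad, gamma):
    """Keep the largest-magnitude entries of grad and zero out the rest.

    Assumption: the number of kept entries is round(gamma * len(grad)),
    with at least one entry always kept. gamma = 1.0 means no compression.
    """
    k = max(1, round(gamma * len(grad)))
    by_magnitude = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(by_magnitude[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def compress_with_error_feedback(grad, residual, gamma):
    """Sparsify (grad + residual) and return the sparse vector plus the new residual.

    The residual holds everything that was not transmitted in earlier rounds,
    so dropped entries are delayed rather than lost (assumed error-feedback form).
    """
    corrected = [g + r for g, r in zip(grad, residual)]
    sparse = sparsify_top_k(corrected, gamma)
    new_residual = [c - s for c, s in zip(corrected, sparse)]
    return sparse, new_residual


# Illustrative values only: a 5-element gradient compressed with gamma = 0.2.
grad = [0.1, -0.9, 0.3, 0.05, -0.4]
sparse, residual = compress_with_error_feedback(grad, [0.0] * len(grad), gamma=0.2)
print(sparse)    # only the largest-magnitude entry would be transmitted
print(residual)  # the remaining entries are kept locally for the next round
```

With a 5-element vector, $\gamma=0.2$ keeps one entry and $\gamma=0.4$ keeps two, so only those values (and their positions) would need to be sent to the server.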

Paper Structure (23 sections, 15 equations, 8 figures, 2 tables, 1 algorithm)

Introduction
Related Works
System Model and Problem Formulation
Our Proposed Method
Key Insights
Local Update on the Client
Personalized Aggregation Strategies with Gradient Correlation
$k$-relevant strategy
$\delta$-threshold strategy
All-correlated strategy
Experiment Results
Datasets Description
Baseline Algorithms and Evaluation Metrics
Experiment Settings
Overall Prediction Performance
...and 8 more sections

Figures (8)

Figure 1: An example of model aggregation using the FedAvg algorithm: (a) Milan city boundary and three selected places; (b) Temporal dynamics of the three selected places; (c) Aggregating models trained on similar traffic patterns improves performance, i.e., the mean squared error (MSE) achieved by $\theta w_1 + (1-\theta) w_2$ is lower than that achieved by either $w_1$ or $w_2$; (d) Aggregating models trained on distinct traffic patterns brings no performance improvement.
Figure 2: Left: The architecture of distributed autonomous networks (DAN); Right: An abstract and simplified DAN for wireless traffic prediction. DAN is a promising 6G networking architecture that supports intelligent network elements such as the network data analytics function (NWDAF). In our wireless traffic prediction system, $M$ SCUs collaboratively train a robust prediction model under the orchestration of a central cloud server in a communication-efficient way.
Figure 3: A demonstration of gradient sparsification. Original gradient vector $g_t^m$ with 5 elements (left) and the corresponding compressed version when $\gamma=0.2$ (middle) and $\gamma=0.4$ (right), respectively. Note that shadowed elements are those that will be transferred and $\gamma=1.0$ indicates no compression.
Figure 4: A toy example of our personalized aggregation strategies on client D. We set $k=2$ in the $k$-relevant strategy and $\delta>0.8$ in the $\delta$-threshold strategy. Besides, $\tilde{\rho}$ denotes the softmax version of $\rho$.
Figure 5: Average traffic distribution of the two real-world datasets. (a) Milan; (b) Trentino. The darker the color, the larger the traffic volume.
...and 3 more figures
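The three personalized aggregation strategies named in the outline and illustrated in Figure 4 can be sketched roughly as follows. This is an illustrative reading of the captions, not the authors' implementation. It assumes that each client has a row `rho_row` of gradient correlations with all clients (including itself), that the $k$-relevant and $\delta$-threshold strategies average the selected models uniformly, and that the all-correlated strategy weights every model by the softmax of its correlation. All function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def k_relevant(rho_row, k):
    """Indices of the k clients with the highest correlation to the target client."""
    order = sorted(range(len(rho_row)), key=lambda j: rho_row[j], reverse=True)
    return sorted(order[:k])


def delta_threshold(rho_row, delta):
    """Indices of clients whose correlation with the target client exceeds delta."""
    return [j for j, r in enumerate(rho_row) if r > delta]


def uniform_weights(selected, n_clients):
    """Equal weight for each selected client, zero elsewhere (assumed weighting)."""
    if not selected:
        raise ValueError("no client selected for aggregation")
    w = 1.0 / len(selected)
    chosen = set(selected)
    return [w if j in chosen else 0.0 for j in range(n_clients)]


def aggregate(models, weights):
    """Weighted sum of flat parameter vectors, one vector per client."""
    dim = len(models[0])
    return [sum(w * m[d] for w, m in zip(weights, models)) for d in range(dim)]


# Illustrative values only: correlations of client D (index 3) with clients A-D.
rho_row = [0.9, 0.2, 0.85, 1.0]
models = [[1.0, 2.0], [3.0, 4.0], [1.5, 2.5], [1.2, 2.2]]

w_k = uniform_weights(k_relevant(rho_row, k=2), len(models))
w_delta = uniform_weights(delta_threshold(rho_row, delta=0.8), len(models))
w_all = softmax(rho_row)

for name, w in [("k-relevant", w_k), ("delta-threshold", w_delta), ("all-correlated", w_all)]:
    print(name, aggregate(models, w))
```

In this sketch the weakly correlated client (index 1) is excluded by the first two strategies and only down-weighted by the third, which is the intuition behind aggregating models from clients with similar traffic patterns.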
