An Efficient Gradient-Aware Error-Bounded Lossy Compressor for Federated Learning
Zhijing Ye, Sheng Di, Jiamin Wang, Zhiqing Zhong, Zhaorui Zhang, Xiaodong Yu
TL;DR
This work tackles the main communication bottleneck in FL, gradient transmission, by introducing a gradient-aware error-bounded lossy compressor (EBLC). It replaces generic EBLC predictors with two components: a cross-round magnitude predictor based on a normalized exponential moving average (EMA), and a kernel-level sign predictor that exploits temporal dynamics and intra-kernel sign consistency and stores its metadata in a compact two-level bitmap. Integrated into APPFL, the method achieves up to 1.53x higher compression ratios than SZ3 and up to 2.11x higher than QSGD, which translates to 76.1%–96.2% savings in end-to-end communication time under constrained bandwidth while preserving accuracy. The approach is shown to be robust across CNN architectures and datasets, supported by detailed layer-level and kernel-size analyses, and is positioned as a practical, scalable solution for real-world FL deployments. Future work includes auto-tuning of hyperparameters and GPU-optimized implementations, and the method could potentially be combined with sparsification and other model-compression techniques.
Abstract
Federated learning (FL) enables collaborative model training without exposing clients' private data, but its deployment is often constrained by the communication cost of transmitting gradients between clients and the central server, especially under system heterogeneity, where low-bandwidth clients bottleneck overall performance. Lossy compression of gradient data can mitigate this overhead, and error-bounded lossy compression (EBLC) is particularly appealing for its fine-grained utility-compression tradeoff. However, existing EBLC methods (e.g., SZ) were originally designed for smooth scientific data with strong spatial locality, and they rely on generic predictors such as Lorenzo and interpolation to reduce entropy and thereby improve the compression ratio. Gradient tensors, in contrast, exhibit low smoothness and weak spatial correlation, which renders these predictors ineffective and leads to poor compression ratios. To address this limitation, we propose an EBLC framework tailored to FL gradient data that achieves high compression ratios while preserving model accuracy. At its core is a prediction mechanism that exploits temporal correlations across FL training rounds and structural regularities within convolutional kernels to reduce residual entropy. The predictor is compatible with standard quantizers and entropy coders and comprises (1) a cross-round magnitude predictor based on a normalized exponential moving average, and (2) a sign predictor that leverages gradient oscillation and kernel-level sign consistency. Experiments show that the proposed EBLC yields up to 1.53x higher compression ratios than SZ3 with lower accuracy loss. Integrated into a real-world FL framework, APPFL, it reduces end-to-end communication time by 76.1%-96.2% under various constrained-bandwidth scenarios, demonstrating strong scalability for real-world FL deployments.
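To make the idea of cross-round magnitude prediction concrete, the following is a minimal sketch, not the authors' implementation. It assumes a particular form of normalization (dividing each round's magnitudes by their mean, which is sent as one scalar of side information), a standard SZ-style linear quantizer with bin width 2·eb, and synthetic gradient magnitudes whose per-element pattern persists across rounds while the overall scale decays. The function names (`quantize`, `entropy_bits`, `compress_round`) and all parameter values are hypothetical. Signs are left out entirely.

```python
import math
import random

def quantize(values, preds, eb):
    """Error-bounded linear quantization of prediction residuals (bin width 2*eb)."""
    codes = [round((v - p) / (2 * eb)) for v, p in zip(values, preds)]
    recon = [p + 2 * eb * c for p, c in zip(preds, codes)]
    return codes, recon

def entropy_bits(codes):
    """Shannon entropy of the integer codes, in bits per symbol."""
    n = len(codes)
    counts = {}
    for c in codes:
        counts[c] = counts.get(c, 0) + 1
    return -sum(k / n * math.log2(k / n) for k in counts.values())

def compress_round(mags, ema_norm, eb, beta):
    """One round: predict from the normalized EMA, quantize, update the EMA.

    Assumption: 'normalized' means dividing by the round's mean magnitude.
    The EMA is updated from reconstructed values only, so a decoder that
    receives `scale` and `codes` can maintain the same state.
    """
    scale = sum(mags) / len(mags)              # one float of side information
    preds = [e * scale for e in ema_norm]
    codes, recon = quantize(mags, preds, eb)
    new_ema = [beta * e + (1 - beta) * (r / scale)
               for e, r in zip(ema_norm, recon)]
    return codes, recon, new_ema

# Synthetic data (an assumption): a fixed per-element pattern, a global scale
# that decays by 10% per round, and +/-10% multiplicative noise.
random.seed(0)
n, rounds, eb, beta = 2000, 8, 0.01, 0.5
pattern = [random.uniform(0.1, 2.0) for _ in range(n)]
ema_norm = [0.0] * n                            # first round has no history
max_err = 0.0
for r in range(rounds):
    mags = [0.9 ** r * p * (1 + random.uniform(-0.1, 0.1)) for p in pattern]
    codes, recon, ema_norm = compress_round(mags, ema_norm, eb, beta)
    max_err = max(max_err, max(abs(m - x) for m, x in zip(mags, recon)))

# Baseline for the last round: same quantizer, but with no prediction at all.
zero_codes, _ = quantize(mags, [0.0] * n, eb)
h_zero = entropy_bits(zero_codes)
h_ema = entropy_bits(codes)
print(f"max abs error: {max_err:.6f} (bound {eb})")
print(f"entropy without prediction: {h_zero:.2f} bits/symbol")
print(f"entropy with EMA prediction: {h_ema:.2f} bits/symbol")
```

On this synthetic data the quantization codes produced with the EMA predictor have lower entropy than those produced without prediction, while the reconstruction error stays within the bound in both cases. Lower code entropy is what lets a downstream entropy coder reach a higher compression ratio. Real gradients are noisier than this toy pattern, so the size of the gain here says nothing about the results reported in the paper.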
