Compressed Proximal Federated Learning for Non-Convex Composite Optimization on Heterogeneous Data

Pu Qiu; Chen Ouyang; Yongyang Xiong; Keyou You; Wanquan Liu; Yang Shi

TL;DR

FedCEF decouples the proximal operator from communication: clients handle non-smooth terms locally and transmit only compressed information. A communication-efficient pre-proximal downlink strategy lets clients exactly reconstruct the global control variables without explicit transmission.

Abstract

Federated Composite Optimization (FCO) has emerged as a promising framework for training models with structural constraints (e.g., sparsity) in distributed edge networks. However, simultaneously achieving communication efficiency and convergence robustness remains a significant challenge, particularly when dealing with non-smooth regularizers, statistical heterogeneity, and the restrictions of biased compression. To address these issues, we propose FedCEF (Federated Composite Error Feedback), a novel algorithm tailored for non-convex FCO. FedCEF introduces a decoupled proximal update scheme that separates the proximal operator from communication, enabling clients to handle non-smooth terms locally while transmitting compressed information. To mitigate the noise from aggressive quantization and the bias from non-IID data, FedCEF integrates a rigorous error feedback mechanism with control variates. Furthermore, we design a communication-efficient pre-proximal downlink strategy that allows clients to exactly reconstruct global control variables without explicit transmission. We theoretically establish that, under general non-convexity, FedCEF converges at a sublinear rate to a bounded residual error whose size is controllable via the step size and batch size. Extensive experiments on real datasets validate that FedCEF maintains competitive model accuracy even under extreme compression ratios (e.g., 1%), significantly reducing the total communication volume compared to uncompressed baselines.
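The abstract names three ingredients: a compressor, error feedback, and a proximal operator for the non-smooth term. The sketch below illustrates generic versions of each in plain Python. It is not the FedCEF algorithm itself: the choice of Top-k as the compressor, the L1 regularizer (whose proximal operator is soft-thresholding), and the names `top_k`, `soft_threshold`, and `ErrorFeedback` are all assumptions made for illustration.

```python
import math


def top_k(v, ratio):
    """Top-k sparsifier (assumed compressor): keep the largest-magnitude
    `ratio` fraction of entries and zero out the rest."""
    k = max(1, round(ratio * len(v)))
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1 (assumed regularizer):
    shrink every entry toward zero by tau."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


class ErrorFeedback:
    """Generic error feedback: whatever the compressor drops in one
    round is stored and added back before compressing the next round."""

    def __init__(self, dim, ratio):
        self.residual = [0.0] * dim
        self.ratio = ratio

    def compress(self, update):
        corrected = [u + r for u, r in zip(update, self.residual)]
        message = top_k(corrected, self.ratio)
        self.residual = [c - m for c, m in zip(corrected, message)]
        return message


if __name__ == "__main__":
    ef = ErrorFeedback(dim=4, ratio=0.25)
    print(ef.compress([1.0, -5.0, 2.0, 0.5]))  # only the entry -5.0 is sent
    print(ef.compress([0.0, 0.0, 0.0, 0.0]))   # the stored 2.0 is sent next
    print(soft_threshold([3.0, -0.5, -2.0], 1.0))
```

In this sketch the proximal step never touches the transmitted message, which mirrors the decoupling described above only in spirit; how FedCEF combines these pieces with control variates is specified in the paper's algorithm, not here.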

Paper Structure

This paper contains 22 sections, 6 theorems, 63 equations, 4 figures, and 1 algorithm.

Introduction
Related work
Contributions
Problem Formulation
The Proposed Algorithm
Decoupled Proximal Local Updates
Communication-Efficient Uplink and Downlink
Mechanism of Control Variates
Theoretical Analysis
Key Lemmas
Main Convergence Result
Experiments
Experimental Setup
Performance Evaluation on CIFAR-10
Performance on MNIST
...and 7 more sections

Key Result

Lemma 1

Suppose that Assumptions 1--4 hold, then, we have

Figures (4)

Figure 1: Test Accuracy vs. Communication Cost on CIFAR-10. FedCEF ($r=0.01$) achieves target accuracy with minimal bandwidth consumption.
Figure 2: Train Loss vs. Communication Cost on CIFAR-10. FedCEF demonstrates superior communication efficiency in loss reduction.
Figure 3: Test Accuracy vs. Communication Cost on MNIST.
Figure 4: Train Loss vs. Communication Cost on MNIST.

Theorems & Definitions (9)

Definition 1: Contractive Compressor
Remark 1: Comparison with Existing Assumptions
Lemma 1: One-step Descent
Lemma 2: Local Estimation Recursion
Lemma 3: Global Estimation Recursion
Lemma 4: Compression Error Recursion
Lemma 5: Local Client Drift
Theorem 1: Convergence Analysis
Remark 2: Analysis of Convergence and Residual Error
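
For orientation, the form in which a contractive compressor is commonly defined in the compression literature is sketched below. The symbols $\mathcal{C}$, $\delta$, and $d$ are assumptions made here; the paper's own Definition 1 may use different notation or constants.

```latex
% Commonly used contractive-compressor condition (assumed form):
% a possibly randomized map C : R^d -> R^d with parameter delta in (0, 1]
\mathbb{E}\,\bigl\| \mathcal{C}(x) - x \bigr\|^2
  \;\le\; (1 - \delta)\,\| x \|^2
  \qquad \text{for all } x \in \mathbb{R}^d .
% Example: Top-k sparsification, which keeps the k largest-magnitude
% entries of x, satisfies this condition with delta = k / d.
```

Under this form, $\delta = 1$ corresponds to no compression, and smaller $\delta$ to more aggressive compression.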