Compressed Proximal Federated Learning for Non-Convex Composite Optimization on Heterogeneous Data

Pu Qiu, Chen Ouyang, Yongyang Xiong, Keyou You, Wanquan Liu, Yang Shi

TL;DR

FedCEF introduces a decoupled proximal update scheme that separates the proximal operator from communication, enabling clients to handle non-smooth terms locally while transmitting compressed information, and designs a communication-efficient pre-proximal downlink strategy that allows clients to exactly reconstruct global control variables without explicit transmission.

Abstract

Federated Composite Optimization (FCO) has emerged as a promising framework for training models with structural constraints (e.g., sparsity) in distributed edge networks. However, simultaneously achieving communication efficiency and convergence robustness remains a significant challenge, particularly when dealing with non-smooth regularizers, statistical heterogeneity, and the restrictions of biased compression. To address these issues, we propose FedCEF (Federated Composite Error Feedback), a novel algorithm tailored for non-convex FCO. FedCEF introduces a decoupled proximal update scheme that separates the proximal operator from communication, enabling clients to handle non-smooth terms locally while transmitting compressed information. To mitigate the noise from aggressive quantization and the bias from non-IID data, FedCEF integrates a rigorous error feedback mechanism with control variates. Furthermore, we design a communication-efficient pre-proximal downlink strategy that allows clients to exactly reconstruct global control variables without explicit transmission. We theoretically establish that, under general non-convexity, FedCEF converges at a sublinear rate to a bounded residual error whose size is controllable via the step size and batch size. Extensive experiments on real datasets show that FedCEF maintains competitive model accuracy even under extreme compression ratios (e.g., 1%), significantly reducing the total communication volume compared to uncompressed baselines.
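
The abstract names three building blocks: a biased compressor, error feedback, and a proximal operator applied locally for the non-smooth term. The sketch below illustrates generic versions of each in plain Python. It is not the FedCEF update itself, which this page does not reproduce. Top-k sparsification, the $\ell_1$ regularizer, and all names (`top_k`, `soft_threshold`, `ErrorFeedbackCompressor`) are illustrative assumptions rather than details taken from the paper.

```python
import math
from typing import List

Vector = List[float]


def top_k(v: Vector, ratio: float) -> Vector:
    """Biased compressor (assumed for illustration): keep the
    ceil(ratio * len(v)) largest-magnitude entries and zero the rest."""
    k = max(1, math.ceil(ratio * len(v)))
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def soft_threshold(v: Vector, tau: float) -> Vector:
    """Proximal operator of tau * ||.||_1, an example of a non-smooth
    sparsity regularizer that a client could handle locally."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


class ErrorFeedbackCompressor:
    """Generic error feedback: whatever the compressor drops is stored
    and added back to the next update before compressing again."""

    def __init__(self, dim: int, ratio: float) -> None:
        self.residual: Vector = [0.0] * dim
        self.ratio = ratio

    def compress(self, update: Vector) -> Vector:
        corrected = [u + e for u, e in zip(update, self.residual)]
        message = top_k(corrected, self.ratio)
        # Remember the part of the corrected update that was not sent.
        self.residual = [c - m for c, m in zip(corrected, message)]
        return message


if __name__ == "__main__":
    ef = ErrorFeedbackCompressor(dim=4, ratio=0.25)
    print(ef.compress([0.1, -3.0, 2.0, 0.5]))  # only the largest entry is sent
    print(ef.compress([0.0, 0.0, 0.0, 0.0]))   # the stored residual is sent next
    print(soft_threshold([1.5, -0.2, -2.0], 0.5))
```

In this toy run the entry dropped in the first round is transmitted in the second, which is the mechanism that lets error feedback tolerate aggressive compression ratios.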

Paper Structure

This paper contains 22 sections, 6 theorems, 63 equations, 4 figures, 1 algorithm.

Key Result

Lemma 1

Suppose that Assumptions 1--4 hold. Then we have

Figures (4)

  • Figure 1: Test Accuracy vs. Communication Cost on CIFAR-10. FedCEF ($r=0.01$) achieves target accuracy with minimal bandwidth consumption.
  • Figure 2: Train Loss vs. Communication Cost on CIFAR-10. FedCEF demonstrates superior communication efficiency in loss reduction.
  • Figure 3: Test Accuracy vs. Communication Cost on MNIST.
  • Figure 4: Train Loss vs. Communication Cost on MNIST.

Theorems & Definitions (9)

  • Definition 1: Contractive Compressor
  • Remark 1: Comparison with Existing Assumptions
  • Lemma 1: One-step Descent
  • Lemma 2: Local Estimation Recursion
  • Lemma 3: Global Estimation Recursion
  • Lemma 4: Compression Error Recursion
  • Lemma 5: Local Client Drift
  • Theorem 1: Convergence Analysis
  • Remark 2: Analysis of Convergence and Residual Error
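
For orientation, the two notions that the outline above relies on are commonly written as follows in the compression and composite-optimization literature. These are the standard textbook forms, stated here as an assumption. The paper's own Definition 1 may use different constants or conditions (Remark 1 compares it with existing assumptions), and the symbols $\mathcal{C}$, $\delta$, $h$, and $\eta$ are notation chosen for this note.

```latex
% Contractive compressor (standard form): a possibly randomized map
% \mathcal{C} : \mathbb{R}^d \to \mathbb{R}^d with parameter \delta \in (0, 1]
\mathbb{E}\,\bigl\| \mathcal{C}(x) - x \bigr\|^2 \;\le\; (1 - \delta)\,\| x \|^2
\qquad \text{for all } x \in \mathbb{R}^d .

% Proximal operator of a non-smooth regularizer h with step size \eta > 0
\operatorname{prox}_{\eta h}(x) \;=\;
\arg\min_{y \in \mathbb{R}^d} \Bigl\{ h(y) + \tfrac{1}{2\eta}\,\| y - x \|^2 \Bigr\} .
```

As a concrete instance, top-$k$ sparsification satisfies the first inequality deterministically with $\delta = k/d$, and for $h = \lambda \|\cdot\|_1$ the proximal operator reduces to coordinate-wise soft-thresholding at level $\eta\lambda$.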