DualFL: A Duality-based Federated Learning Algorithm with Communication Acceleration in the General Convex Regime

Jongho Park, Jinchao Xu

TL;DR

DualFL addresses the challenge of achieving communication acceleration in federated learning for general convex objectives, including nonsmooth and non-strongly convex costs. It leverages a Fenchel-Rockafellar duality-based reformulation and an accelerated forward-backward (inexact FISTA) scheme, realized deterministically through a predualization step to produce primal updates. Theoretical results establish convergence in both nonsmooth strongly convex and smooth strongly convex regimes, with optimal or near-optimal communication complexities, and extend to non-strongly convex problems via $\ell^2$ regularization and epi-convergence arguments. Numerical experiments on MNIST and CIFAR-10 demonstrate faster energy decay and robustness to the number of clients, validating the practical efficacy of the duality-based approach compared to existing federated learning methods.
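To make the "accelerated forward-backward (FISTA)" ingredient concrete, here is a minimal sketch of the generic, exact FISTA iteration on a toy separable problem. It is not DualFL itself: DualFL applies an inexact variant to a dual formulation, whereas the objective below, the function names `soft_threshold`, `fista`, and `closed_form`, and all parameter values are assumptions chosen only for illustration.

```python
import math


def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar v."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def fista(a, b, lam, num_iters=500):
    """Minimize sum_i 0.5 * a[i] * (x[i] - b[i])**2 + lam * |x[i]| by FISTA.

    Toy problem (an assumption for illustration); every a[i] must be positive.
    """
    lipschitz = max(a)  # Lipschitz constant of the gradient of the smooth part
    x_prev = [0.0] * len(b)
    y = list(x_prev)
    t = 1.0
    for _ in range(num_iters):
        # Forward (gradient) step on the smooth part, then backward (prox) step.
        x = [
            soft_threshold(yi - ai * (yi - bi) / lipschitz, lam / lipschitz)
            for ai, bi, yi in zip(a, b, y)
        ]
        # Momentum update that distinguishes FISTA from plain forward-backward.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [xi + (t - 1.0) / t_next * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
    return x_prev


def closed_form(a, b, lam):
    """Exact minimizer of the separable toy problem, used as a reference."""
    return [soft_threshold(bi, lam / ai) for ai, bi in zip(a, b)]
```

For example, `fista([1.0, 4.0], [3.0, -0.5], 1.0)` can be compared against `closed_form([1.0, 4.0], [3.0, -0.5], 1.0)`, which evaluates to `[2.0, -0.25]`. The toy problem was picked because it has a closed-form minimizer, so the iteration can be checked directly.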

Abstract

We propose a new training algorithm, named DualFL (Dualized Federated Learning), for solving distributed optimization problems in federated learning. DualFL achieves communication acceleration for very general convex cost functions, thereby providing a solution to an open theoretical problem in federated learning concerning cost functions that may be neither smooth nor strongly convex. We provide a detailed analysis of the local iteration complexity of DualFL to ensure its overall computational efficiency. Furthermore, we introduce a completely new approach to the convergence analysis of federated learning based on a dual formulation. This new technique enables concise and elegant analysis, in contrast to the complex calculations used in the existing literature on the convergence of federated learning algorithms.

Paper Structure

This paper contains 16 sections, 12 theorems, 57 equations, 1 figure, 2 tables, 2 algorithms.

Key Result

Proposition 3.1

Suppose that each $f_j$, $1 \leq j \leq N$, in the federated learning problem is $\mu$-strongly convex for some $\mu > 0$. For a positive constant $\nu \in (0, \mu]$, if $\theta_j \in \Omega$ solves the local primal problem, then $\xi_j = \nu (\zeta_j^{(n)} - \theta_j) \in \Omega$ solves the corresponding local dual problem, where $g_j (\theta) = f_j (\theta) - \frac{\nu}{2} \| \theta \|^2$. Moreover, we have ...
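The restriction $\nu \in (0, \mu]$ is what keeps $g_j$ convex. A short derivation follows. It assumes that $\| \cdot \|$ is a Euclidean (inner-product) norm, in which case $\mu$-strong convexity of $f_j$ is equivalent to convexity of $f_j - \frac{\mu}{2} \| \cdot \|^2$:

```latex
g_j(\theta)
  = f_j(\theta) - \frac{\nu}{2} \| \theta \|^2
  = \underbrace{\Big( f_j(\theta) - \frac{\mu}{2} \| \theta \|^2 \Big)}_{\text{convex, since } f_j \text{ is } \mu\text{-strongly convex}}
  + \underbrace{\frac{\mu - \nu}{2} \| \theta \|^2}_{\text{convex, since } \nu \leq \mu}
```

Under that assumption, $g_j$ is $(\mu - \nu)$-strongly convex when $\nu < \mu$ and merely convex when $\nu = \mu$.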

Figures (1)

  • Figure 1: Relative energy error $\frac{E(\theta) - E(\theta^*)}{E(\theta^*)}$ with respect to the number of communication rounds in various training algorithms for multinomial logistic regression on the (a-c) MNIST and (d-f) CIFAR-10 training datasets. (a, d) Comparison of DualFL with benchmark algorithms. (b, e) Convergence of DualFL when the number of clients $N$ changes. (c, f) Convergence of DualFL when the value of the hyperparameter $\rho$ changes.
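The quantity plotted in Figure 1 can be computed as in the following minimal sketch. The function names and the sample energy values are hypothetical, and in practice $E(\theta^*)$ would presumably be replaced by the energy of a reference solution computed to high accuracy, which is an assumption here rather than something stated in the caption.

```python
def relative_energy_error(energy, optimal_energy):
    """Return (E(theta) - E(theta*)) / E(theta*) for scalar energies."""
    if optimal_energy == 0:
        raise ValueError("optimal_energy must be nonzero")
    return (energy - optimal_energy) / optimal_energy


def error_curve(energies, optimal_energy):
    """Relative energy error after each communication round."""
    return [relative_energy_error(e, optimal_energy) for e in energies]


# Hypothetical energies recorded after three communication rounds.
curve = error_curve([2.0, 1.5, 1.0], optimal_energy=1.0)
```

Plotting such a curve against the round index on a logarithmic vertical axis is one common way to display this kind of decay.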

Theorems & Definitions (18)

  • Proposition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Theorem 4.1
  • Proof 1
  • Theorem 4.2
  • Proof 2
  • Proposition 6.1: Fenchel-Rockafellar duality
  • ...and 8 more