Federated Composite Saddle Point Optimization

Site Bai; Brian Bullins

TL;DR

This work tackles the challenge of federated learning for composite saddle point problems (SPP), where constraints or non-smooth regularization are present. It introduces FeDualEx, an extra-step primal-dual algorithm that leverages generalized Bregman divergence and dual aggregation to handle non-smooth regularization within FL, and it provides the first convergence rate results for federated composite SPP under homogeneous data. The sequential variants extend the approach to stochastic and deterministic composite optimization, yielding $O(1/\sqrt{T})$ and $O(1/T)$ rates respectively, broadening the theoretical foundation beyond the FL setting. Empirically, FeDualEx outperforms baselines on tasks with sparsity and low-rank regularization, demonstrating practical impact for constrained or regularized federated SPP in real-world ML problems.
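One way to see why aggregating in the dual space can matter for non-smooth regularization is a small sparsity example. The sketch below is illustrative only and is not the FeDualEx update: it assumes a Euclidean setting with an $\ell_1$ regularizer, so the proximal map is soft-thresholding, and the client dual states `z1`, `z2` and the threshold `lam` are hypothetical values chosen for the demonstration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of lam * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical dual states held by two clients (illustrative values only).
z1 = np.array([1.5, -0.2, 0.0, 0.8])
z2 = np.array([-0.3, 1.4, 0.0, 0.9])
lam = 1.0

# Primal aggregation: each client maps to a sparse primal point, then average.
x1 = soft_threshold(z1, lam)
x2 = soft_threshold(z2, lam)
primal_avg = 0.5 * (x1 + x2)

# Dual aggregation: average the dual states first, then map to the primal space.
dual_avg = soft_threshold(0.5 * (z1 + z2), lam)

print("nonzeros per client:", np.count_nonzero(x1), np.count_nonzero(x2))
print("nonzeros after primal averaging:", np.count_nonzero(primal_avg))
print("nonzeros after dual averaging:", np.count_nonzero(dual_avg))
```

In this toy case, averaging the clients' sparse primal points yields a denser vector than either client holds, while thresholding the averaged dual state keeps the result sparse.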

Abstract

Federated learning (FL) approaches for saddle point problems (SPP) have recently gained in popularity due to the critical role they play in machine learning (ML). Existing works mostly target smooth unconstrained objectives in Euclidean space, whereas ML problems often involve constraints or non-smooth regularization, which results in a need for composite optimization. Addressing these issues, we propose Federated Dual Extrapolation (FeDualEx), an extra-step primal-dual algorithm, which is the first of its kind that encompasses both saddle point optimization and composite objectives under the FL paradigm. Both the convergence analysis and the empirical evaluation demonstrate the effectiveness of FeDualEx in these challenging settings. In addition, even for the sequential version of FeDualEx, we provide rates for the stochastic composite saddle point setting which, to our knowledge, are not found in prior literature.
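To illustrate what an extra step buys in saddle point optimization, the following minimal sketch compares plain gradient descent-ascent with the Euclidean extragradient method on the toy bilinear problem $\min_x \max_y xy$. This is a single-machine, unconstrained, non-composite example and not FeDualEx itself; the function names, step size `eta`, and iteration count are assumptions made for the demonstration.

```python
import math

def gda_step(x, y, eta):
    """One simultaneous gradient descent-ascent step on f(x, y) = x * y."""
    return x - eta * y, y + eta * x

def extragradient_step(x, y, eta):
    """One extragradient step: evaluate the gradient at a look-ahead point."""
    x_half, y_half = x - eta * y, y + eta * x      # extrapolation step
    return x - eta * y_half, y + eta * x_half      # update with look-ahead gradient

def run(step, x=1.0, y=1.0, eta=0.1, iters=1000):
    """Iterate `step` and return the distance to the saddle point (0, 0)."""
    for _ in range(iters):
        x, y = step(x, y, eta)
    return math.hypot(x, y)

gda_dist = run(gda_step)
eg_dist = run(extragradient_step)
print("GDA distance to saddle point:", gda_dist)
print("Extragradient distance to saddle point:", eg_dist)
```

On this problem each descent-ascent step multiplies the squared distance to the saddle point by $1 + \eta^2$, so the iterates spiral outward, whereas each extragradient step multiplies it by $1 - \eta^2 + \eta^4 < 1$ for $0 < \eta < 1$.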

Paper Structure (44 sections, 28 theorems, 140 equations, 6 figures, 2 tables, 5 algorithms)

Introduction
Related Work
Preliminaries and Definitions
Composite Saddle Point Optimization
Mirror Prox and Dual Extrapolation
Generalized Bregman Divergence
Federated Learning
Federated Dual Extrapolation (FeDualEx)
The FeDualEx Algorithm
Convergence Analysis of FeDualEx
FeDualEx in Sequential Settings
Stochastic Composite Saddle Point Optimization
Deterministic Composite Saddle Point Optimization
Experiments
Saddle Point Problem with Sparsity Regularization and Ball Constraint
...and 29 more sections

Key Result

Theorem 1

Under the paper's assumptions, the duality gap evaluated at the ergodic sequence generated by the intermediate steps of the FeDualEx algorithm is bounded by … Choosing step size $\eta^c = \min \{\frac{1}{5\beta^2}, \frac{B^\frac{1}{4}}{20^\frac{1}{4}\beta^\frac{1}{2}G^\frac{1}{2}K^\frac{3}{4}R^\frac{1}{4}}, \frac{B^\frac{1}{2}M^\frac{1}{2}}{5^\frac{1}{2}\sigma R^\frac{1}{2} K^\frac{1}{2}}, \fra

Figures (6)

Figure 1: Dual Extrapolation.
Figure 2: The composite saddle point optimization problem with $\ell_1$-norm sparsity regularization from (jiang2022generalized), and the closed-form evaluation of its duality gap.
Figure 3: The composite saddle point optimization problem with nuclear-norm low-rank regularization, and the closed-form evaluation of its duality gap.
Figure 4: Duality gap and sparsity of the solution to the SPP in Figure 2.
Figure 5: Duality gap and rank of the solution to the nuclear norm regularized SPP in Figure 3.
...and 1 more figure

Theorems & Definitions (57)

Definition 1: Composite SPP
Definition 2: Generalized Bregman Divergence (flammarion2017stochastic)
Definition 3: Generalized Bregman Divergence for Saddle Functions
Definition 4: Generalized Proximal Operator for Saddle Functions
Theorem 1: Main
Lemma 1: Bounding the Regularization Term
Lemma 2: Bounding the Smooth Term
Theorem 2
Theorem 3
Theorem 4
...and 47 more
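
For orientation, a composite saddle point problem and its duality gap are commonly written as below. This is a sketch of the standard formulation rather than a quotation of Definition 1; the symbols $f$, $\psi_1$, $\psi_2$, $\mathcal{X}$, $\mathcal{Y}$, and $\phi$ are assumed notation, with $f$ smooth and convex-concave and $\psi_1$, $\psi_2$ convex and possibly non-smooth (for example an $\ell_1$ or nuclear norm, or the indicator of a constraint set).

```latex
\min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \;
  \phi(x, y) = f(x, y) + \psi_1(x) - \psi_2(y),
\qquad
\mathrm{Gap}(\hat{x}, \hat{y})
  = \max_{y \in \mathcal{Y}} \phi(\hat{x}, y)
  - \min_{x \in \mathcal{X}} \phi(x, \hat{y}).
```

The gap is nonnegative and vanishes exactly at a saddle point of $\phi$, which is why it serves as the convergence measure for this class of problems.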
