A distributed proximal splitting method with linesearch for locally Lipschitz gradients

Felipe Atenas, Minh N. Dao, Matthew K. Tam

TL;DR

This work develops a distributed first-order method for multi-agent optimization with nonsmooth and smooth components, where the smooth parts have locally Lipschitz gradients and global Lipschitz constants may be unavailable. By embedding a primal-dual proximal-gradient scheme inside an abstract backtracking linesearch, it decouples the consensus constraint via a dual formulation and achieves convergence without global Lipschitz constants. The authors introduce two distributed linesearch schemes—one using a scalar sum and another using a global minimum—to compute adaptive step sizes with only local communication, and prove convergence to saddle points with an ergodic rate. Numerical experiments on distributed Poisson regression and information-matrix estimation illustrate robustness and improved consensus behavior over fixed-step methods, highlighting practical applicability to problems with locally Lipschitz gradients.

Abstract

In this paper, we propose a distributed first-order algorithm with backtracking linesearch for solving multi-agent minimisation problems, where each agent handles a local objective involving nonsmooth and smooth components. Unlike existing methods that require global Lipschitz continuity and predefined stepsizes, our algorithm adjusts stepsizes using distributed linesearch procedures, making it suitable for problems where global constants are unavailable or difficult to compute. The proposed algorithm is designed within an abstract linesearch framework for a primal-dual proximal-gradient method to solve min-max convex-concave problems, enabling the consensus constraint to be decoupled from the optimisation task. Our theoretical analysis allows for gradients of functions to be locally Lipschitz continuous, relaxing the prevalent assumption of globally Lipschitz continuous gradients.
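
To illustrate how a backtracking linesearch can stand in for a global Lipschitz constant, the following is a minimal single-agent sketch, not the distributed algorithm proposed in the paper. It assumes a scalar test problem chosen here for illustration, $\min_x \tfrac14 x^4 - bx + \lambda|x|$, whose smooth part has a gradient that is only locally Lipschitz. The names `prox_grad_backtracking` and `solve_quartic_l1`, the shrink factor, and the acceptance test (the usual quadratic upper bound) are all assumptions of this sketch.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| (scalar soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_grad_backtracking(f, grad_f, prox_g, x0, tau0=1.0, shrink=0.5, iters=500):
    """Proximal-gradient iteration with a backtracking linesearch on tau.

    Hypothetical helper: tau is only ever decreased, and no Lipschitz
    constant of grad_f is supplied by the caller.
    """
    x, tau = float(x0), float(tau0)
    for _ in range(iters):
        fx, gx = f(x), grad_f(x)
        for _ in range(60):  # cap on backtracking steps, a safeguard only
            x_new = prox_g(x - tau * gx, tau)
            d = x_new - x
            slack = 1e-12 * max(1.0, abs(fx))  # tolerance for rounding error
            # Accept tau once the quadratic upper bound on f holds at x_new.
            if f(x_new) <= fx + gx * d + d * d / (2.0 * tau) + slack:
                break
            tau *= shrink
        x = x_new
    return x, tau


def solve_quartic_l1(b, lam=1.0):
    """Illustrative problem: minimise 0.25*x**4 - b*x + lam*|x| from x0 = 0."""
    f = lambda x: 0.25 * x**4 - b * x
    grad_f = lambda x: x**3 - b          # locally, not globally, Lipschitz
    prox_g = lambda v, tau: soft_threshold(v, tau * lam)
    return prox_grad_backtracking(f, grad_f, prox_g, x0=0.0)


# For b > lam a positive minimiser satisfies x**3 = b - lam, so b = 28 gives x = 3.
x_hat, tau_final = solve_quartic_l1(28.0)
```

In this sketch the stepsize adapts to the curvature encountered along the iterates; in the distributed setting described above, the agents would additionally have to agree on the stepsize, which is the role of the paper's two linesearch subroutines.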

Paper Structure

This paper contains 19 sections, 8 theorems, 87 equations, 8 figures, 5 algorithms.

Key Result

Lemma 2.1

Suppose $\phi: C \to \mathbbm R$ is a differentiable function over an open convex set $C \subseteq \mathcal{U}$, such that $\nabla \phi$ is Lipschitz continuous on $C$ with constant $L>0$. Then, for all $x,y \in C$,
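
The displayed inequality is not reproduced in this extract. As a hedged reconstruction, the standard descent lemma under exactly these hypotheses yields the bound below; the paper's own statement may differ in form (for instance, it may be stated as a two-sided bound).

$$\phi(y) \le \phi(x) + \langle \nabla \phi(x),\, y - x \rangle + \frac{L}{2}\,\|y - x\|^2 .$$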

Figures (8)

  • Figure 1: Reconstruction of a simple image from distributed blurry and noisy measurements. Each row corresponds to an agent, and each column to a method. From left to right: the unique original image, PG-EXTRA \cite{shi2015proximal} with constant stepsize $\tau = 0.01$, Algorithm \ref{a:PGE} with constant stepsize $\tau_0 = 0.1\cdot\frac{\sqrt{2\delta_K}}{\sqrt{\beta(1 - \lambda_{\min}(W))}}$, Algorithm \ref{a:PGE} implemented with Subroutine \ref{LS:1} starting from $\tau = \tau_0$, and Algorithm \ref{a:PGE} implemented with Subroutine \ref{LS:2} starting from $\tau = \tau_0$.
  • Figure 2: Reconstruction of a synthetic phantom image from distributed blurry and noisy measurements. Each row corresponds to an agent, and each column to a method. From left to right: original image, PG-EXTRA \cite{shi2015proximal} with constant stepsize $\tau = 0.01$, Algorithm \ref{a:PGE} with constant stepsize $\tau_0 = 0.1\cdot\frac{\sqrt{2\delta_K}}{\sqrt{\beta(1 - \lambda_{\min}(W))}}$, Algorithm \ref{a:PGE} implemented with Subroutine \ref{LS:1} starting from $\tau = \tau_0$, and Algorithm \ref{a:PGE} implemented with Subroutine \ref{LS:2} starting from $\tau = \tau_0$.
  • Figure 3: Relative error $\Delta \mathbf{x}^k = \frac{\|\mathbf{x}^{k+1}-\mathbf{x}^k\|}{\|\mathbf{x}^0 - \mathbf{x}_{\text{true}}\|}$, where $\mathbf{x}_{\text{true}}$ is the original image.
  • Figure 4: Feasibility measure $\|(I-W)\mathbf{x}^k\|$ along outer iterations.
  • Figure 5: Stepsize sequence along outer iterations.
  • ...and 3 more figures
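
The two diagnostics named in the captions of Figures 3 and 4 can be computed as in the following sketch. It assumes each agent's variable is stored as one row of a list of lists and that $W$ acts row-wise on the stacked variable; the 3-agent matrix `W` below is a hypothetical symmetric, doubly stochastic example and is not taken from the paper, and all function names are illustrative.

```python
import math


def sub(A, B):
    """Entrywise difference of two stacked variables (lists of rows)."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def frob_norm(A):
    """Euclidean (Frobenius) norm of a stacked variable."""
    return math.sqrt(sum(v * v for row in A for v in row))


def mix(W, X):
    """Apply W to the stacked variable: row i becomes sum_j W[i][j] * X[j]."""
    n, d = len(X), len(X[0])
    return [[sum(W[i][j] * X[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]


def feasibility(W, X):
    """Feasibility measure ||(I - W) x||; zero when all agents agree."""
    return frob_norm(sub(X, mix(W, X)))


def relative_change(X_next, X_curr, X0, X_true):
    """Relative error ||x^{k+1} - x^k|| / ||x^0 - x_true||."""
    return frob_norm(sub(X_next, X_curr)) / frob_norm(sub(X0, X_true))


# Hypothetical mixing matrix for three fully connected agents.
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

X_consensus = [[2.0, -1.0], [2.0, -1.0], [2.0, -1.0]]  # all agents agree
X_spread = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]        # agents disagree

feas_consensus = feasibility(W, X_consensus)  # vanishes at consensus
feas_spread = feasibility(W, X_spread)        # positive under disagreement
```

Because the rows of this `W` sum to one, any consensus vector is left unchanged by mixing, which is why the feasibility measure serves as a consensus residual.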

Theorems & Definitions (22)

  • Lemma 2.1: Descent lemma
  • Lemma 2.2: \cite{friedlander2023perspective}
  • Definition 2.3: Mixing matrix
  • Lemma 3.2: Sufficient decrease and boundedness of iterates
  • proof
  • Theorem 3.4
  • proof
  • Remark 3.5
  • Remark 3.6
  • Lemma 3.7
  • ...and 12 more