Understanding the ADMM Algorithm via High-Resolution Differential Equations

Bowen Li; Bin Shi

TL;DR

This work studies the iterative behavior of ADMM in distributed convex optimization by deriving a system of high-resolution ODEs that captures the $\lambda$-correction, a small perturbation that causes trajectories to depart from the constraint hyperplane. Using Lyapunov analysis, the authors carry results from the continuous high-resolution dynamics over to the discrete ADMM algorithm and identify the numerical error of the implicit discretization as a key factor shaping convergence rate and monotonicity. They prove time-average convergence at rate $O(1/t)$ for general convex objectives and an $O(1/(N+1))$ rate for the averaged iterate when one component of the objective is strongly convex; similar results extend to a general form of ADMM via a stabilizing first-step modification. The findings offer a principled explanation of ADMM dynamics in distributed settings and point to possible enhancements and to connections with related primal-dual methods such as PDHG.

Abstract

In statistics, machine learning, image science, and related areas, there is an increasing demand for decentralized collection or storage of large-scale datasets, as well as for distributed solution methods. To tackle this challenge, the alternating direction method of multipliers (ADMM) has emerged as a widely used approach, particularly well-suited to distributed convex optimization. However, the iterative behavior of ADMM has not been well understood. In this paper, we employ dimensional analysis to derive a system of high-resolution ordinary differential equations (ODEs) for ADMM. This system captures an important characteristic of ADMM, called the $\lambda$-correction, which causes the trajectory of ADMM to deviate from the constraint hyperplane. To explore the convergence behavior of the system of high-resolution ODEs, we utilize Lyapunov analysis and extend our findings to the discrete ADMM algorithm. Through this analysis, we identify the numerical error resulting from the implicit scheme as a crucial factor that affects the convergence rate and monotonicity of the discrete ADMM algorithm. We further discover that if one component of the objective function is assumed to be strongly convex, the iterative average of ADMM converges strongly with a rate $O(1/N)$, where $N$ is the number of iterations.
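To make the objects in the abstract concrete, here is a minimal Python sketch of the textbook ADMM iteration on a toy problem. Everything in it is an assumption chosen for illustration rather than taken from the paper: the objective $\frac12\|x-a\|^2 + \frac12\|y-b\|^2$, the constraint $x - y = 0$, the penalty parameter `rho`, and the function name `admm_quadratic`. The sketch records the constraint residual at every step, so one can see that the iterates need not lie on the constraint hyperplane, and it also returns the averaged $x$-iterate.

```python
import numpy as np

def admm_quadratic(a, b, rho=1.0, num_iters=100):
    """Textbook ADMM for min 0.5||x-a||^2 + 0.5||y-b||^2  s.t.  x - y = 0.

    Illustrative toy problem (an assumption, not the paper's setting).
    Both subproblems have closed-form minimizers.
    """
    x = np.zeros_like(a)
    y = np.zeros_like(b)
    lam = np.zeros_like(a)          # Lagrange multiplier
    x_sum = np.zeros_like(a)        # running sum for the averaged iterate
    residuals = []
    for _ in range(num_iters):
        # x-update: minimize the augmented Lagrangian over x, (y, lam) fixed
        x = (a - lam + rho * y) / (1.0 + rho)
        # y-update: minimize over y using the new x
        y = (b + lam + rho * x) / (1.0 + rho)
        # multiplier update driven by the constraint residual x - y
        r = x - y
        lam = lam + rho * r
        residuals.append(np.linalg.norm(r))
        x_sum += x
    return x, y, lam, x_sum / num_iters, residuals

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.0])
x, y, lam, x_avg, residuals = admm_quadratic(a, b)
print("last iterate:", x, "averaged iterate:", x_avg)
print("first residual:", residuals[0], "last residual:", residuals[-1])
```

For this toy problem the minimizer is $x = y = (a+b)/2$. The multiplier changes exactly when the residual $x - y$ is nonzero, which is why the early iterates sit off the hyperplane.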

Paper Structure (12 sections, 16 theorems, 78 equations, 2 figures)

Introduction
$\lambda$-correction: a small but essential perturbation
Numerical errors: byproduct of the implicit scheme
Overview of contributions
Preliminaries
Perspective from the system of high-resolution ODEs
Derivation of the system of high-resolution ODEs
Convergence of the system of high-resolution ODEs
Convergence rates of ADMM
Monotonicity
A general form of ADMM
Conclusion and discussion

Key Result

Theorem 2.3

Let $f \in \mathcal{F}^{0}(\mathbb{R}^d)$. Then for any $x \in \mathbb{R}^{d}$, the subdifferential $\partial f(x)$ is nonempty.
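As a small numerical illustration of this statement, the sketch below uses the standard subgradient inequality: $g \in \partial f(x)$ when $f(y) \ge f(x) + \langle g, y - x\rangle$ for all $y$. It assumes that $\mathcal{F}^{0}(\mathbb{R}^d)$ denotes convex functions on $\mathbb{R}^d$ (the definition is not reproduced here), and it takes $f(x) = |x|$ with $d = 1$ as a hypothetical example. The helper `is_subgradient` only tests the inequality on a finite grid, so it is a sanity check rather than a proof.

```python
import numpy as np

def is_subgradient(f, x, g, test_points, tol=1e-12):
    """Check f(y) >= f(x) + g * (y - x) on a finite grid of points y."""
    return all(f(y) >= f(x) + g * (y - x) - tol for y in test_points)

f = abs                                   # f(x) = |x|, convex but not differentiable at 0
grid = np.linspace(-2.0, 2.0, 401)

# Candidates inside [-1, 1] satisfy the inequality at x = 0 ...
inside = [is_subgradient(f, 0.0, g, grid) for g in (-1.0, -0.3, 0.0, 0.7, 1.0)]
# ... while candidates outside [-1, 1] violate it somewhere on the grid.
outside = [is_subgradient(f, 0.0, g, grid) for g in (-1.5, 1.2)]
print(inside, outside)
```

At the kink $x = 0$ the function has no gradient, yet the subdifferential is the nonempty interval $[-1, 1]$, consistent with the theorem.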

Figures (2)

Figure 1: A schematic diagram of the trajectory of ADMM with any initial $(x_0, y_0)$ (Black).
Figure 2: A schematic diagram of the trajectory of ADMM with any initial $(x_0, y_0)$ (Black) and the continuous limit ODE with any initial $x_0$ (Blue), where the initial $y_0'$ is not arbitrary and is required to satisfy $Fx_0 + Gy_0' = h$.
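The feasibility condition in the Figure 2 caption can be checked numerically. The sketch below assumes randomly generated $F$, $G$, $h$, and $x_0$ (hypothetical data, with $G$ chosen to have more columns than rows so that the linear system is solvable) and computes one feasible $y_0'$ as the minimum-norm solution of $G y = h - F x_0$.

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((2, 3))
G = rng.standard_normal((2, 4))   # wide random matrix: full row rank almost surely
h = rng.standard_normal(2)
x0 = rng.standard_normal(3)

# Minimum-norm solution of G y = h - F x0, so that (x0, y0') lies on the hyperplane
y0_prime, *_ = np.linalg.lstsq(G, h - F @ x0, rcond=None)

residual = F @ x0 + G @ y0_prime - h
print("constraint residual norm:", np.linalg.norm(residual))
```

Any other solution of the same linear system would serve equally well as a feasible starting point; the minimum-norm choice is just a convenient default.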

Theorems & Definitions (22)

Definition 2.1
Definition 2.2
Theorem 2.3
Theorem 2.4
Theorem 2.5: Theorem 23.8 in rockafellar1970convex
Theorem 2.6: Theorem 25.1 in rockafellar1970convex
Definition 2.7
Theorem 2.8: Theorem 11.50 in rockafellar2009variational
Lemma 3.1
Proof of the lemma labeled `lem: ode`
...and 12 more
