Local Second-Order Limit Dynamics of the Alternating Direction Method of Multipliers for Semidefinite Programming

Shucheng Kang; Heng Yang

TL;DR

This paper tackles slow convergence of ADMM when solving large-scale semidefinite programs with multiple KKT points by developing a local, second-order limit-dynamics framework around an arbitrary KKT point. Central to the approach is a refined parabolic second-order directional derivative of the PSD projection, from which a local limit map is derived to describe persistent drift after transients are filtered out; the analysis reveals a primal–dual decoupling and a cone of directions where first-order updates vanish. The authors establish structural properties of the limit map (kernel, range, continuity) and show how the penalty parameter $\sigma$ influences the dynamics, connecting these findings to fixed points, almost-invariant sets, and microscopic phases. Numerical experiments on Mittelmann SDP instances corroborate the theory, explaining phenomena such as small nonzero angles between iterate differences, infeasibility metrics that are insensitive to $\sigma$, and transient confinement to a low-dimensional subspace, thereby providing practical guidance for algorithm design and parameter tuning. The work offers a physics-informed lens on region-wise transient behavior in first-order methods for SDPs and lays groundwork for accelerated strategies leveraging second-order limit dynamics.

Abstract

The alternating direction method of multipliers (ADMM) is widely used for solving large-scale semidefinite programs (SDPs), yet on instances with multiple primal--dual optimal solution pairs, it often enters prolonged slow-convergence regions where the Karush--Kuhn--Tucker (KKT) residuals nearly stall. To explain and predict the fine-grained dynamical behavior inside these regions, we develop a local second-order limit dynamics framework for ADMM near an arbitrary KKT point -- not necessarily the eventual limit point of the iterates. Assuming the existence of a strictly complementary primal--dual solution pair, we derive a second-order local expansion of the ADMM dynamics by leveraging a refined and simplified variational characterization of the (parabolic) second-order directional derivative of the PSD projection operator. This expansion reveals a closed convex cone of directions along which the local first-order update vanishes, and it induces a second-order limit map that governs the persistent drift after transient effects are filtered out. We characterize fundamental properties of this mapping, including its kernel, range, and continuity. A primal--dual decoupling further yields a clean scaling law for the effect of the penalty parameter in ADMM. We connect these properties to second-order dynamical features of ADMM, including fixed points, almost-invariant sets, and microscopic phases. Three empirical phenomena in slow-convergence regions are then explained or predicted: (i) angles between consecutive iterate differences are small yet nonzero, except for sparse spikes; (ii) primal and dual infeasibilities are insensitive to penalty-parameter updates; and (iii) iterates can be transiently trapped in a low-dimensional subspace for an extended period. Extensive numerical experiments on the Mittelmann dataset corroborate our theoretical predictions.
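The two basic objects named in the abstract, the PSD projection $\Pi_{\mathbb{S}^{n}_+}$ and the angle between consecutive iterate differences, can be sketched numerically as follows. This is a minimal illustration under assumptions rather than the authors' implementation: the function names `proj_psd`, `angle_between`, and `fd_directional_derivative` are hypothetical, the projection uses the standard eigenvalue-clipping construction, and the directional derivative is approximated by a plain one-sided finite difference instead of the closed-form characterization developed in the paper.

```python
import numpy as np


def proj_psd(Z):
    """Project a symmetric matrix onto the PSD cone by clipping negative eigenvalues."""
    Z = 0.5 * (Z + Z.T)                      # symmetrize to guard against round-off
    w, Q = np.linalg.eigh(Z)                 # eigenvalues in ascending order
    return (Q * np.maximum(w, 0.0)) @ Q.T    # Q diag(max(w, 0)) Q^T


def angle_between(D1, D2):
    """Angle (radians) between two matrices under the Frobenius inner product."""
    denom = np.linalg.norm(D1) * np.linalg.norm(D2)
    if denom == 0.0:
        return float("nan")                  # angle is undefined for a zero difference
    cos = np.sum(D1 * D2) / denom
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def fd_directional_derivative(Z, H, t=1e-6):
    """One-sided finite-difference estimate of the directional derivative of proj_psd at Z along H."""
    return (proj_psd(Z + t * H) - proj_psd(Z)) / t


if __name__ == "__main__":
    Z = np.diag([1.0, -2.0])
    print(proj_psd(Z))                       # negative eigenvalue is clipped to zero

    # Hypothetical iterate differences, only to show how the angle diagnostic is evaluated.
    dZ_k = np.array([[1.0, 0.0], [0.0, 0.0]])
    dZ_k1 = np.array([[1.0, 0.01], [0.01, 0.0]])
    print(angle_between(dZ_k, dZ_k1))
```

Given stored iterates, `angle_between` would be applied to successive differences $\Delta Z^{(k)}$ and $\Delta Z^{(k+1)}$. The finite-difference helper is only a sanity check: it loses accuracy as `t` shrinks and gives no information about the second-order term.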

Paper Structure (89 sections, 36 theorems, 286 equations, 15 figures)

Introduction
ADMM for SDP.
One-dimensional criteria.
ADMM for SDP: Empirical Slow-Convergence Patterns
Empirical patterns in slow-convergence regions.
Experiment I.
Experiment II.
Contributions
A refined and simplified formula for the second-order directional derivative of $\Pi_{\mathbb{S}^{n}_+}(\cdot)$.
A local second-order limiting model for ADMM near any $\widebar{Z} \in {\cal Z}_\star$.
Numerical verification.
Limitations
Scope and interpretation.
Notation
Outline
...and 74 more sections

Key Result

Theorem 1

Let $Z \in \mathbb{S}^{n}$ be given by the first-level description. For any $H \in \mathbb{S}^{n}$ given by the second-level description, for a non-diagonal $Z \in \mathbb{S}^{n}$: pick $Q \in {\cal O}^n(Z)$. Denote $\widetilde{Z}$ : ...

Figures (15)

Figure 1: Trajectories of $r^{(k)}_{\max}$, $\lVert \Delta Z^{(k)} \rVert_\mathsf{F}$, and $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ in Experiment I.
Figure 2: Trajectories of $\lVert \Delta X^{(k)} \rVert_\mathsf{F}$, $\lVert \Delta S^{(k)} \rVert_\mathsf{F}$, $r^{(k)}_p$, and $r^{(k)}_d$ w.r.t. $\sigma$ in Experiment II.
Figure 3: Illustration of the local second-order limit dynamics of ADMM for SDPs. The spectrahedron represents the optimal solution set ${\cal Z}_\star$. The blue cone depicts ${\cal C}(\widebar{Z})$, the cone of directions along which ADMM's local first-order update vanishes. The purple cone depicts ${\cal T}_{{\cal Z}_\star}(\widebar{Z})$, the tangent cone to ${\cal Z}_\star$ attached at $\widebar{Z}$. In the left panel, the green points and flow indicate the transient local first-order dynamics, which vanishes as $k\to\infty$ and converges to ${\cal C}(\widebar{Z})$. The red points and wavy trajectories illustrate the transient local second-order dynamics. For each point of the form $\widebar{Z} + t\widebar{H}$ with a stalled first-order direction $\widebar{H} \in {\cal C}(\widebar{Z})$, the second-order iterate difference converges to $\frac{t^2}{2} \phi(\widebar{Z}; \widebar{H})$ (red arrows in the right panel), capturing ADMM's limiting behavior up to second order.
Figure 4: $\log_{10}(\lVert \Delta Z^{(k)} \rVert_\mathsf{F})$ in three SDP examples. In each example, the initialization is chosen as $Z^{(0)} = \widebar{Z} + t\widebar{H}$, where $\widebar{Z} \in {\cal Z}_\star$ and $\widebar{H} \in {\cal C}(\widebar{Z}) \backslash {\cal T}_{{\cal Z}_\star}(\widebar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to $1$.
Figure 5: $\log_{10}(\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)}))$ in three SDP examples. In each example, the initialization is chosen as $Z^{(0)} = \widebar{Z} + t\widebar{H}$, where $\widebar{Z} \in {\cal Z}_\star$ and $\widebar{H} \in {\cal C}(\widebar{Z}) \backslash {\cal T}_{{\cal Z}_\star}(\widebar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to $1$.
...and 10 more figures

Theorems & Definitions (73)

Theorem 1: $\Pi_{\mathbb{S}^{n}_+}'(Z; H)$
Theorem 2: $\Pi_{\mathbb{S}^{n}_+}''(Z; H, W)$
Theorem 3: $\Pi_{\mathbb{S}^{n}_-}'(Z; H)$
proof
Theorem 4: $\Pi_{\mathbb{S}^{n}_-}''(Z; H, W)$
proof
Proposition 1: ${\cal X}_\star$ and ${\cal S}_\star$
proof
Remark 1
Definition 1: Local first- and second-order dynamics
...and 63 more
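
For orientation, the objects named in Theorems 1 to 4 can be written out under the standard conventions of variational analysis. The listing does not reproduce the paper's own definitions, so what follows is an assumption about notation and not a restatement of the theorems. With $\Pi = \Pi_{\mathbb{S}^{n}_+}$, the first-order and parabolic second-order directional derivatives at $Z$ along $H$ (and $W$) are usually defined as

$$
\Pi'(Z; H) = \lim_{t \downarrow 0} \frac{\Pi(Z + tH) - \Pi(Z)}{t},
\qquad
\Pi''(Z; H, W) = \lim_{t \downarrow 0} \frac{\Pi\!\left(Z + tH + \tfrac{1}{2} t^2 W\right) - \Pi(Z) - t\,\Pi'(Z; H)}{\tfrac{1}{2} t^2}.
$$

Under these definitions, the Moreau decomposition $Z = \Pi_{\mathbb{S}^{n}_+}(Z) + \Pi_{\mathbb{S}^{n}_-}(Z)$ links the two cones directly:

$$
\Pi_{\mathbb{S}^{n}_-}'(Z; H) = H - \Pi_{\mathbb{S}^{n}_+}'(Z; H),
\qquad
\Pi_{\mathbb{S}^{n}_-}''(Z; H, W) = W - \Pi_{\mathbb{S}^{n}_+}''(Z; H, W).
$$

The paper may derive Theorems 3 and 4 by a different route. The identities above only show how statements for $\mathbb{S}^{n}_-$ can follow from those for $\mathbb{S}^{n}_+$.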
