
Local Second-Order Limit Dynamics of the Alternating Direction Method of Multipliers for Semidefinite Programming

Shucheng Kang, Heng Yang

TL;DR

This paper tackles slow convergence of ADMM when solving large-scale semidefinite programs with multiple KKT points by developing a local, second-order limit-dynamics framework around an arbitrary KKT point. Central to the approach is a refined parabolic second-order directional derivative of the PSD projection, from which a local limit map is derived to describe persistent drift after transients are filtered out; the analysis reveals a primal–dual decoupling and a cone of directions where first-order updates vanish. The authors establish structural properties of the limit map (kernel, range, continuity) and show how the penalty parameter $\sigma$ influences the dynamics, connecting these findings to fixed points, almost-invariant sets, and microscopic phases. Numerical experiments on Mittelmann SDP instances corroborate the theory, explaining phenomena such as small nonzero angles between iterate differences, infeasibility metrics that are insensitive to $\sigma$, and transient confinement to a low-dimensional subspace, thereby providing practical guidance for algorithm design and parameter tuning. The work offers a physics-informed lens on region-wise transient behavior in first-order methods for SDPs and lays groundwork for accelerated strategies leveraging second-order limit dynamics.

Abstract

The alternating direction method of multipliers (ADMM) is widely used for solving large-scale semidefinite programs (SDPs), yet on instances with multiple primal–dual optimal solution pairs, it often enters prolonged slow-convergence regions where the Karush–Kuhn–Tucker (KKT) residuals nearly stall. To explain and predict the fine-grained dynamical behavior inside these regions, we develop a local second-order limit dynamics framework for ADMM near an arbitrary KKT point, which is not necessarily the eventual limit point of the iterates. Assuming the existence of a strictly complementary primal–dual solution pair, we derive a second-order local expansion of the ADMM dynamics by leveraging a refined and simplified variational characterization of the (parabolic) second-order directional derivative of the PSD projection operator. This expansion reveals a closed convex cone of directions along which the local first-order update vanishes, and it induces a second-order limit map that governs the persistent drift after transient effects are filtered out. We characterize fundamental properties of this mapping, including its kernel, range, and continuity. A primal–dual decoupling further yields a clean scaling law for the effect of the penalty parameter in ADMM. We connect these properties to second-order dynamical features of ADMM, including fixed points, almost-invariant sets, and microscopic phases. Three empirical phenomena in slow-convergence regions are then explained or predicted: (i) angles between consecutive iterate differences are small yet nonzero, except for sparse spikes; (ii) primal and dual infeasibilities are insensitive to penalty-parameter updates; and (iii) iterates can be transiently trapped in a low-dimensional subspace for an extended period. Extensive numerical experiments on the Mittelmann dataset corroborate our theoretical predictions.
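
For orientation, the two derivative notions named in the abstract can be written out as follows. This is a sketch using the standard textbook definitions of the directional derivative and the parabolic second-order directional derivative; the normalization by $\tfrac{1}{2}t^2$ is an assumption here and may differ from the convention adopted in the paper itself.

```latex
% Standard definitions (assumed normalization), for Z, H, W in S^n:
\Pi_{\mathbb{S}^{n}_+}'(Z; H)
  = \lim_{t \downarrow 0}
    \frac{\Pi_{\mathbb{S}^{n}_+}(Z + tH) - \Pi_{\mathbb{S}^{n}_+}(Z)}{t},
\qquad
\Pi_{\mathbb{S}^{n}_+}''(Z; H, W)
  = \lim_{t \downarrow 0}
    \frac{\Pi_{\mathbb{S}^{n}_+}\bigl(Z + tH + \tfrac{1}{2}t^{2}W\bigr)
          - \Pi_{\mathbb{S}^{n}_+}(Z)
          - t\,\Pi_{\mathbb{S}^{n}_+}'(Z; H)}{\tfrac{1}{2}t^{2}}.
```

Under these definitions, a direction $H$ with a vanishing first-order term leaves only the $\tfrac{1}{2}t^2$-scaled contribution, which is the regime the second-order limit map is meant to describe.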

Paper Structure (89 sections, 36 theorems, 286 equations, 15 figures)


Key Result

Theorem 1

Let $Z \in \mathbb{S}^{n}$ be given by the first-level description. For any $H \in \mathbb{S}^{n}$ given by the second-level description, For a non-diagonal $Z \in \mathbb{S}^{n}$: Pick $Q \in {\cal O}^n(Z)$. Denote $\widetilde{Z}$ :

Figures (15)

  • Figure 1: Trajectories of $r^{(k)}_{\max}$, $\lVert \Delta Z^{(k)} \rVert_\mathsf{F}$, and $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ in Experiment I.
  • Figure 2: Trajectories of $\lVert \Delta X^{(k)} \rVert_\mathsf{F}$, $\lVert \Delta S^{(k)} \rVert_\mathsf{F}$, $r^{(k)}_p$, and $r^{(k)}_d$ w.r.t. $\sigma$ in Experiment II.
  • Figure 3: Illustration of the local second-order limit dynamics of ADMM for SDPs. The spectrahedron represents the optimal solution set ${\cal Z}_\star$. The blue cone depicts ${\cal C}(\widebar{Z})$, the cone of directions along which ADMM's local first-order update vanishes. The purple cone depicts ${\cal T}_{{\cal Z}_\star}(\widebar{Z})$, the tangent cone to ${\cal Z}_\star$ attached at $\widebar{Z}$. In the left panel, the green points and flow indicate the transient local first-order dynamics, which vanishes as $k\to\infty$ and converges to ${\cal C}(\widebar{Z})$. The red points and wavy trajectories illustrate the transient local second-order dynamics. For each point of the form $\widebar{Z} + t\widebar{H}$ with a stalled first-order direction $\widebar{H} \in {\cal C}(\widebar{Z})$, the second-order iterate difference converges to $\frac{t^2}{2} \phi(\widebar{Z}; \widebar{H})$ (red arrows in the right panel), capturing ADMM's limiting behavior up to second order.
  • Figure 4: $\log_{10}(\lVert \Delta Z^{(k)} \rVert_\mathsf{F})$ in three SDP examples. In each example, the initialization is chosen as $Z^{(0)} = \widebar{Z} + t\widebar{H}$, where $\widebar{Z} \in {\cal Z}_\star$ and $\widebar{H} \in {\cal C}(\widebar{Z}) \backslash {\cal T}_{{\cal Z}_\star}(\widebar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to $1$.
  • Figure 5: $\log_{10}(\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)}))$ in three SDP examples. In each example, the initialization is chosen as $Z^{(0)} = \widebar{Z} + t\widebar{H}$, where $\widebar{Z} \in {\cal Z}_\star$ and $\widebar{H} \in {\cal C}(\widebar{Z}) \backslash {\cal T}_{{\cal Z}_\star}(\widebar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to $1$.
  • ...and 10 more figures
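
The angle diagnostic $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ plotted in Figures 1 and 5 can be reproduced from any stored iterate sequence. The following is a minimal sketch, assuming $\Delta Z^{(k)} = Z^{(k+1)} - Z^{(k)}$ and the Frobenius inner product; the function name `consecutive_angles` is hypothetical and not taken from the paper.

```python
import numpy as np

def consecutive_angles(iterates):
    """Angles (radians) between successive differences of a matrix sequence.

    Assumes Delta Z^(k) = Z^(k+1) - Z^(k) and the Frobenius inner product.
    Returns one angle per pair of consecutive differences (NaN if either
    difference is exactly zero).
    """
    diffs = [b - a for a, b in zip(iterates[:-1], iterates[1:])]
    angles = []
    for d0, d1 in zip(diffs[:-1], diffs[1:]):
        n0, n1 = np.linalg.norm(d0), np.linalg.norm(d1)  # Frobenius norms
        if n0 == 0.0 or n1 == 0.0:
            angles.append(float("nan"))
            continue
        u, v = d0 / n0, d1 / n1
        # For unit u, v: |u - v| = 2 sin(theta/2) and |u + v| = 2 cos(theta/2),
        # so theta = 2 * atan2(|u - v|, |u + v|). This stays accurate for
        # tiny angles, where arccos of the cosine loses precision.
        theta = 2.0 * np.arctan2(np.linalg.norm(u - v), np.linalg.norm(u + v))
        angles.append(float(theta))
    return angles

# Toy usage: a straight-line segment followed by an orthogonal turn.
A = np.eye(2)
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(consecutive_angles([0 * A, A, 2 * A, 2 * A + B]))
```

The `atan2` form is a design choice: since the figures plot $\log_{10}$ of the angle, values far below $10^{-8}$ may matter, and the `arccos` formula cannot resolve them in double precision.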

Theorems & Definitions (73)

  • Theorem 1: $\Pi_{\mathbb{S}^{n}_+}'(Z; H)$
  • Theorem 2: $\Pi_{\mathbb{S}^{n}_+}''(Z; H, W)$
  • Theorem 3: $\Pi_{\mathbb{S}^{n}_-}'(Z; H)$
  • proof
  • Theorem 4: $\Pi_{\mathbb{S}^{n}_-}''(Z; H, W)$
  • proof
  • Proposition 1: ${\cal X}_\star$ and ${\cal S}_\star$
  • proof
  • Remark 1
  • Definition 1: Local first- and second-order dynamics
  • ...and 63 more
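
The objects in Theorems 1 and 2, the first-order and parabolic second-order directional derivatives of the PSD projection, can be probed numerically by finite differences. The sketch below is not the paper's closed-form characterization; it assumes the standard limit definitions with a $\tfrac{1}{2}t^2$ normalization, and the helper names (`proj_psd`, `fd_first`, `fd_parabolic_second`) are hypothetical. The reference values in the example come from a hand computation on a $2\times 2$ case.

```python
import numpy as np

def proj_psd(Z):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone."""
    w, Q = np.linalg.eigh(Z)
    # Scale each eigenvector column by its clipped eigenvalue, then recombine.
    return (Q * np.maximum(w, 0.0)) @ Q.T

def fd_first(Z, H, t=1e-6):
    """One-sided finite-difference estimate of Pi'(Z; H)."""
    return (proj_psd(Z + t * H) - proj_psd(Z)) / t

def fd_parabolic_second(Z, H, W, dPi, t=1e-3):
    """Finite-difference estimate of Pi''(Z; H, W).

    dPi should be Pi'(Z; H), ideally exact: any error in it is amplified
    by a factor of order 1/t in the result.
    """
    step = Z + t * H + 0.5 * t * t * W
    return (proj_psd(step) - proj_psd(Z) - t * dPi) / (0.5 * t * t)

# Smooth case: Z = diag(1, -1), H swaps the two eigen-directions.
Z = np.diag([1.0, -1.0])
H = np.array([[0.0, 1.0], [1.0, 0.0]])
dPi = 0.5 * H  # hand-derived value of Pi'(Z; H) for this 2x2 case
print(fd_first(Z, H))
print(fd_parabolic_second(Z, H, np.zeros((2, 2)), dPi))

# Degenerate case: a zero eigenvalue makes Pi' one-sided (nonlinear in H).
Z0 = np.diag([1.0, 0.0])
print(fd_first(Z0, np.diag([0.0, 1.0])), fd_first(Z0, np.diag([0.0, -1.0])))
```

In the degenerate case the two estimates differ by more than a sign, which illustrates why the projection is only directionally differentiable at matrices with zero eigenvalues and why a dedicated second-order analysis is needed there.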