Stochastic momentum ADMM for nonconvex and nonsmooth optimization with application to PnP algorithm

Kangkang Deng, Shuchang Zhang, Boyu Wang, Jiachen Jin, Juan Zhou, Hongxia Wang

TL;DR

This work develops SMADMM, a single-loop stochastic ADMM for nonconvex nonsmooth optimization under linear constraints in an online setting, achieving the optimal oracle complexity of $O(\epsilon^{-3/2})$ with $O(1)$ stochastic gradient evaluations per iteration (initial batch $m=O(\epsilon^{-1/2})$). The authors extend the method with dynamic step sizes and penalty parameters, proving that optimal complexity persists, and they further integrate plug-and-play priors to yield PnP-SMADMM with convergence guarantees. They provide rigorous convergence results for both constant-parameter and dynamically scheduled variants, and establish the PnP extension’s theoretical validity under mild assumptions. Empirical results on graph-guided binary classification, sparse-view CT reconstruction, and phase retrieval show that SMADMM and PnP-SMADMM outperform state-of-the-art online stochastic ADMM methods in accuracy and efficiency, supporting the practical impact of the approach.

Abstract

This paper proposes SMADMM, a single-loop Stochastic Momentum Alternating Direction Method of Multipliers for solving a class of nonconvex and nonsmooth composite optimization problems. SMADMM achieves the optimal oracle complexity of $\mathcal{O}(\epsilon^{-3/2})$ in the online setting. Unlike previous stochastic ADMM algorithms that require large mini-batches or a double-loop structure, SMADMM uses only $\mathcal{O}(1)$ stochastic gradient evaluations per iteration and avoids costly restarts. To further improve practicality, we incorporate dynamic step sizes and penalty parameters, proving that SMADMM maintains its optimal complexity without the need for large initial batches. We also develop PnP-SMADMM by integrating plug-and-play priors, and establish its theoretical convergence under mild assumptions. Extensive experiments on classification, CT image reconstruction, and phase retrieval tasks demonstrate that our approach outperforms existing stochastic ADMM methods in both accuracy and efficiency, validating our theoretical results.
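To make the single-loop idea concrete, the following is a minimal sketch of a linearized stochastic ADMM iteration driven by a recursive momentum gradient estimator. It is not the paper's Algorithm: the STORM-style estimator, the toy convex least-squares loss with an $\ell_1$ term, the constraint $Ax - y = 0$ with $A = I$, and all names and parameter values (`momentum_admm`, `eta`, `alpha`, `rho`, `mu`) are assumptions chosen for illustration. Each iteration draws one sample and evaluates its gradient at two points, so the per-iteration cost is $\mathcal{O}(1)$ gradient evaluations.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def stoch_grad(x, a, b):
    # Gradient of the single-sample loss 0.5 * (a^T x - b)^2
    return a * (a @ x - b)

def momentum_admm(X, b, A, mu=0.01, rho=1.0, eta=0.02, alpha=0.1,
                  iters=3000, seed=0):
    """Toy single-loop stochastic ADMM for
           min_x  mean_i 0.5*(X_i^T x - b_i)^2 + mu*||y||_1   s.t.  A x - y = 0.
    All hyperparameters are illustrative assumptions, not values from the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    x = np.zeros(d)
    lam = np.zeros(A.shape[0])
    i = rng.integers(n)
    v = stoch_grad(x, X[i], b[i])  # initial gradient estimate from one sample
    for _ in range(iters):
        # y-step: exact proximal step on the augmented Lagrangian
        y = soft_threshold(A @ x + lam / rho, mu / rho)
        # x-step: one linearized gradient step using the momentum estimate v
        x_new = x - eta * (v + A.T @ (lam + rho * (A @ x - y)))
        # dual ascent on the multiplier
        lam = lam + rho * (A @ x_new - y)
        # recursive momentum update: the same sample is evaluated at x_new and x
        i = rng.integers(n)
        v = stoch_grad(x_new, X[i], b[i]) + (1.0 - alpha) * (
            v - stoch_grad(x, X[i], b[i]))
        x = x_new
    return x, y, lam

# Synthetic data (hypothetical, for demonstration only)
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 5))
x_true = np.array([1.0, -2.0, 0.0, 0.0, 3.0])
b = X @ x_true + 0.01 * data_rng.standard_normal(200)
A = np.eye(5)

def objective(x, mu=0.01):
    return 0.5 * np.mean((X @ x - b) ** 2) + mu * np.sum(np.abs(A @ x))

x, y, lam = momentum_admm(X, b, A)
print("objective at zero:", objective(np.zeros(5)))
print("objective at x:   ", objective(x))
print("constraint residual:", np.linalg.norm(A @ x - y))
```

The correction term `(1 - alpha) * (v - stoch_grad(x, ...))` is what lets the estimate track the gradient without large mini-batches or periodic restarts; a plain stochastic-gradient ADMM would instead set `v` to the fresh sample gradient alone.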

Paper Structure

This paper contains 17 sections, 12 theorems, 85 equations, 3 figures, 6 tables, 2 algorithms.

Key Result

Theorem 3.1

Suppose that Assumptions assm:lipsciz-assum:variance hold. Let the sequence $\left\{x_k, y_k, \lambda_k\right\}_{k=1}^K$ be generated by Algorithm alg:sam. Assume that and $m = \lceil \rho\rceil$, where $\phi_{\min}$ and $\phi_{\max}$ denote the smallest and largest eigenvalues of the positive definite matrix $Q$, $\sigma_A$ denotes the smallest eigenvalue of the matrix $A A^

Figures (3)

  • Figure 1: Comparison of epoch-wise trends for five algorithms across four datasets.
  • Figure 2: Visual comparison of 180-view CT reconstruction with RED-SD and PnP-SADMM. The input SNR is $50$ dB, and the batch size is set to 5.
  • Figure 3: Performance comparison of CT image reconstruction over iterations with 5 minibatch sizes.

Theorems & Definitions (23)

  • Definition 2.1
  • Definition 2.7: stochastic first-order oracle
  • Theorem 3.1
  • Theorem 3.2
  • Lemma 4.1: xu2015augmented, Lemma 2
  • Lemma 4.2
  • Proof 1
  • Lemma 4.3
  • Proof 2
  • Lemma 4.4
  • ...and 13 more