Accelerating Multi-Block Constrained Optimization Through Learning to Optimize

Ling Liang; Cameron Austin; Haizhao Yang

TL;DR

This work accelerates the convergence of multi-block ADMM-type methods by integrating Learning to Optimize (L2O) into the Majorized Proximal Augmented Lagrangian Method (MPALM). It introduces a supervised-learning framework that adaptively selects the penalty parameters $\{\sigma_j\}$, preserves the convergence guarantees of MPALM via a symmetric Gauss-Seidel (SGS) based proximal update, and outperforms fixed-parameter MPALM and standard baselines empirically on Lasso and discrete optimal transport. The approach uses the majorization $q_\xi(y;y')$ and the SGS decomposition to obtain efficient, block-separable updates, while empirical risk minimization (ERM) training tunes the hyperparameters to the distribution of problem instances. The resulting method is a flexible, data-driven way to apply convergent MPALM to large-scale, linearly constrained composite problems such as sparse regression and optimal transport.

Abstract

Learning to Optimize (L2O) approaches, including algorithm unrolling, plug-and-play methods, and hyperparameter learning, have garnered significant attention and have been successfully applied to the Alternating Direction Method of Multipliers (ADMM) and its variants. However, the natural extension of L2O to multi-block ADMM-type methods remains largely unexplored. Such an extension is critical, as multi-block methods leverage the separable structure of optimization problems, offering substantial reductions in per-iteration complexity. Because classical multi-block ADMM does not guarantee convergence, the Majorized Proximal Augmented Lagrangian Method (MPALM), which has a similar form to multi-block ADMM and ensures convergence, is more suitable in this setting. Despite its theoretical advantages, MPALM's performance is highly sensitive to the choice of penalty parameters. To address this limitation, we propose a novel L2O approach that adaptively selects these penalty parameters using supervised learning. We demonstrate the versatility and effectiveness of our method by applying it to the Lasso problem and the optimal transport problem. Our numerical results show that the proposed framework outperforms popular alternatives. Given its applicability to generic linearly constrained composite optimization problems, this work opens the door to a wide range of potential real-world applications.
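
The sensitivity to the penalty parameter noted in the abstract can be illustrated with a minimal sketch. The code below is not the paper's MPALM or its learned selection rule: it uses the textbook two-block ADMM for Lasso (splitting $x = z$) and a plain grid search over a scalar penalty `sigma`, scored by mean iteration count on a few random instances. All function names, the instance generator, the choice of `lam`, and the stopping rule are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(A, b, lam, x):
    """Lasso objective 0.5*||Ax - b||^2 + lam*||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))


def admm_lasso(A, b, lam, sigma, tol=1e-6, max_iter=5000):
    """Two-block ADMM for min 0.5*||Ax-b||^2 + lam*||z||_1 s.t. x = z.

    Returns the iterate z and the number of iterations performed.
    """
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled multiplier
    # The x-update solves a linear system with the same matrix every iteration.
    M = A.T @ A + sigma * np.eye(n)
    Atb = A.T @ b
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(M, Atb + sigma * (z - u))
        z_old = z
        z = soft_threshold(x + u, lam / sigma)
        u = u + x - z
        primal_res = np.linalg.norm(x - z)
        dual_res = sigma * np.linalg.norm(z - z_old)
        if max(primal_res, dual_res) <= tol:
            return z, k
    return z, max_iter


def random_lasso_instance(rng, m=10, n=20):
    """Hypothetical instance generator: Gaussian A and b, lam below lam_max."""
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)
    lam = 0.1 * np.max(np.abs(A.T @ b))
    return A, b, lam


def pick_sigma_by_grid(instances, sigmas):
    """Return the sigma with the lowest mean iteration count, plus all means."""
    mean_iters = [
        float(np.mean([admm_lasso(A, b, lam, s)[1] for (A, b, lam) in instances]))
        for s in sigmas
    ]
    return sigmas[int(np.argmin(mean_iters))], mean_iters


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = [random_lasso_instance(rng) for _ in range(5)]
    best, means = pick_sigma_by_grid(train, [0.1, 1.0, 10.0])
    print("mean iterations per sigma:", means)
    print("selected sigma:", best)
```

Grid search picks one fixed value for the whole training set, which is the crudest form of tuning a hyperparameter to a distribution of instances; the supervised approach described in the abstract selects the penalty parameters adaptively instead.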

Paper Structure (17 sections, 3 theorems, 32 equations, 2 figures, 6 algorithms)

This paper contains 17 sections, 3 theorems, 32 equations, 2 figures, 6 algorithms.

Introduction
Optimization model
Our contributions
Related work
The majorized proximal augmented Lagrangian method
Hyperparameter learning
Application to classical Lasso problems
Application to optimal transport problems
Results
Conclusion
The two-block ADMM for problem (P(ξ))
Direct multi-block extension of the two-block ADMM
MPALM for Lasso
MPALM for optimal transport
Detailed experimental settings
...and 2 more sections

Key Result

Theorem 1

Suppose that Assumptions kkt-solvable and f-Lip hold and $\mathcal{S}:\mathbb{Y}\to \mathbb{Y}$ is a given self-adjoint linear operator such that $\frac{1}{2}\Sigma + \sigma\mathcal{A}\mathcal{A}^* + \mathcal{S} \succ 0$, and $\mathcal{S}\succeq -\frac{1}{2}\Sigma$. Let $\{(x^k, y^k)\}$ be the seque

Figures (2)

Figure 1: Lasso: normalized MSE for problem sizes $(m,n) = (10,20)$, $(10,100)$, $(10,200)$, $(20,100)$
Figure 2: Optimal transport: NMSE for randomly generated data with $m=n=196$ and for the MNIST image data set with $m=n=49$. First and third panels: LMPALM; second and fourth panels: Sinkhorn's algorithm.

Theorems & Definitions (4)

Theorem 1
Theorem 2: Minimizing the ALM subproblem [li2019block]
Lemma 1: Dual problem of the classical Lasso problem
Remark 1
