Optimizing ADMM and Over-Relaxed ADMM Parameters for Linear Quadratic Problems

Jintao Song, Wenqi Lu, Yunwen Lei, Yuchao Tang, Zhenkuan Pan, Jinming Duan

TL;DR

This work tackles the challenge of selecting optimal ADMM and over-relaxed ADMM parameters for linear quadratic problems (LQPs). By formulating ADMM as a fixed-point iteration and analyzing the spectral radius of the iteration matrix, the authors derive a numerical gradient-descent method to optimize the penalty parameter $\theta$ and a closed-form solution for the relaxation parameter $\alpha^*$. They prove unconditional convergence for ADMM on LQPs and reduce the joint optimization to a single-variable problem in $\theta$, enabling efficient pre-iteration parameter tuning. The methods are validated on random problem instances and imaging applications (diffeomorphic image registration, image deblurring, and MRI reconstruction), showing faster convergence and practical improvements over baselines. The proposed framework offers a principled, generalizable approach to parameter selection that can extend beyond quadratic problems to other convex/non-smooth settings.

Abstract

The Alternating Direction Method of Multipliers (ADMM) has gained significant attention across a broad spectrum of machine learning applications. Incorporating the over-relaxation technique shows potential for enhancing the convergence rate of ADMM. However, determining optimal algorithmic parameters, including both the associated penalty and relaxation parameters, often relies on empirical approaches tailored to specific problem domains and contextual scenarios. Incorrect parameter selection can significantly hinder ADMM's convergence rate. To address this challenge, we first propose a general approach to optimize the value of the penalty parameter, followed by a novel closed-form formula to compute the optimal relaxation parameter in the context of linear quadratic problems (LQPs). We then experimentally validate our parameter selection methods through random instantiations and diverse imaging applications, encompassing diffeomorphic image registration, image deblurring, and MRI reconstruction.
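
As a rough illustration of the setting described in the abstract, the sketch below runs over-relaxed ADMM (in the standard scaled form) on a small random LQP and compares the result with a direct linear solve. The problem form $\min_u \tfrac{1}{2}\|Au - f\|^2 + \tfrac{\mu}{2}\|w\|^2$ subject to $w = Lu$, the function name `over_relaxed_admm`, the auxiliary names `w`, `b`, `h`, and all sizes and parameter values are assumptions chosen for illustration; they are not taken from the paper's Algorithm 1.

```python
import numpy as np

def over_relaxed_admm(A, L, f, mu, theta, alpha, iters=300):
    """Scaled-form ADMM with over-relaxation for the ASSUMED problem
        min_u 0.5*||A u - f||^2 + 0.5*mu*||w||^2   s.t.   w = L u.
    theta is the penalty parameter; alpha is the relaxation parameter
    (alpha = 1 reduces to plain ADMM)."""
    w = np.zeros(L.shape[0])          # splitting variable
    b = np.zeros(L.shape[0])          # scaled dual variable
    M = A.T @ A + theta * (L.T @ L)   # system matrix of the u-update
    Atf = A.T @ f
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        # u-update: quadratic subproblem, solved exactly
        u = np.linalg.solve(M, Atf + theta * (L.T @ (w - b)))
        # over-relaxation: blend L u with the previous w
        h = alpha * (L @ u) + (1.0 - alpha) * w
        # w-update: closed-form minimizer of the quadratic in w
        w = theta * (h + b) / (mu + theta)
        # dual update
        b = b + h - w
    return u

# Hypothetical random instance (sizes and seed are arbitrary choices).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
L = rng.standard_normal((20, 20))
f = rng.standard_normal(30)
mu = 1.0

# Reference solution from the normal equations of the assumed problem.
u_star = np.linalg.solve(A.T @ A + mu * (L.T @ L), A.T @ f)

for theta, alpha in [(5.0, 1.0), (5.0, 1.5)]:
    u = over_relaxed_admm(A, L, f, mu, theta, alpha)
    print(f"theta={theta}, alpha={alpha}, error={np.linalg.norm(u - u_star):.2e}")
```

Both parameter settings reach the same minimizer; what changes with $\theta$ and $\alpha$ is how quickly the iterates get there, which is the quantity the parameter selection methods aim to improve.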

Paper Structure

This paper contains 12 sections, 4 theorems, 89 equations, 5 figures, and 2 tables.

Key Result

Theorem 1

To determine the optimal penalty parameter $\theta^*$ in ADMM automatically, the ADMM iterations in Algorithm 1 must first be transformed into a fixed-point iteration system solely with respect to the variable $u$, in which $I+Q$ is the iteration matrix and $Q$ is defined as […]. Next, given a value of $\mu$, we can prove […] regardless of the value of $\theta$. As per Section 3.1, we know that […]
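
To make the role of the iteration matrix concrete, the following sketch runs a generic linear fixed-point iteration $u^{k+1} = (I+Q)u^k + c$ and compares the observed per-step error reduction with the spectral radius of $I+Q$. The matrix `Q` here is a hypothetical symmetric stand-in with eigenvalues in $(-1, 0)$ and the vector `c` is arbitrary; neither is the paper's definition, which is not reproduced in this summary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Hypothetical stand-in for Q: symmetric, eigenvalues drawn from (-0.9, -0.1).
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = -rng.uniform(0.1, 0.9, size=n)
Q = V @ np.diag(eigs) @ V.T
c = rng.standard_normal(n)        # arbitrary constant term (assumption)

T = np.eye(n) + Q                 # iteration matrix I + Q
rho = np.max(np.abs(np.linalg.eigvalsh(T)))   # spectral radius (T is symmetric)

# Fixed point: u = (I + Q) u + c  is equivalent to  -Q u = c.
u_star = np.linalg.solve(-Q, c)

u = np.zeros(n)
errors = []
for _ in range(60):
    u = T @ u + c
    errors.append(np.linalg.norm(u - u_star))

# Ratio of successive errors; for symmetric T it never exceeds rho.
ratios = np.array(errors[1:]) / np.array(errors[:-1])
print("spectral radius:", rho)
print("last observed contraction ratio:", ratios[-1])
```

Because the error contracts by at most the spectral radius at every step in this symmetric case, choosing parameters that shrink the spectral radius directly speeds up the iteration.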

Figures (5)

  • Figure 1: Relationship between $|1 + \alpha \lambda_i(Q)|$ and the value of $\alpha$. The slope of each line before reflection is $\lambda_i\left(Q\right)$. The spectral radius before the intersection point is governed by the green line, while after the reflection, it is determined by the reflected red line. The intersection point corresponds to the optimal $\alpha^*$ as well as the minimum spectral radius of the iteration matrix $I + \alpha Q$.
  • Figure 2: Left: Convergence rates of different methods and parameter values based on 1 random instantiation of $A$ and $L$. Right: Convergence rates based on 50 random instantiations of $A$ and $L$. The solid lines represent the average over 50 instantiations. The algorithm is ADMM when $\alpha=1$, and oADMM when $\alpha = \alpha^*$.
  • Figure 3: Illustration of diffeomorphic image registration results, visualization of the correlation between spectral radius and $\theta$, and comparison of convergence rates of algorithms. The $x$-axes of the two plots in the third column represent the values of $\theta$ and iteration numbers, respectively.
  • Figure 4: Demonstration of image deblurring effects and convergence rates of different algorithms. The $x$-axis and $y$-axis of each plot in the second row represent iteration numbers and $\log(\|u^k - u^*\|)$, respectively.
  • Figure 5: Demonstration of MRI reconstruction results and comparison of convergence rates among algorithms. The $x$-axis and $y$-axis of each plot in the second row represent iteration numbers and $\log(\|u^k - u^*\|)$, respectively.
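
The geometric picture in the Figure 1 caption can be sketched numerically. Assuming the eigenvalues $\lambda_i(Q)$ are real and negative, the spectral radius of $I + \alpha Q$ is $\max_i |1 + \alpha \lambda_i(Q)|$, and the two extreme lines intersect (after reflection) at $\alpha = -2 / (\lambda_{\min} + \lambda_{\max})$. This is the classical minimax argument and may differ in form from the paper's closed-form expression; the eigenvalues below and the helper names `spectral_radius` and `optimal_alpha` are hypothetical.

```python
import numpy as np

def spectral_radius(alpha, lam):
    """Spectral radius of I + alpha*Q given the (real) eigenvalues lam of Q."""
    return np.max(np.abs(1.0 + alpha * lam))

def optimal_alpha(lam):
    """Intersection of the lines for the extreme eigenvalues, assuming all
    eigenvalues are real and negative: 1 + a*lam_max = -(1 + a*lam_min)."""
    return -2.0 / (lam.min() + lam.max())

# Hypothetical eigenvalues of Q (real, negative), chosen for illustration.
lam = np.array([-0.9, -0.6, -0.35, -0.2])

alpha_star = optimal_alpha(lam)

# Brute-force check on a grid of alpha values.
alphas = np.linspace(0.01, 3.0, 3000)
radii = np.array([spectral_radius(a, lam) for a in alphas])
alpha_grid = alphas[np.argmin(radii)]

print("closed-form alpha*:", alpha_star)
print("grid-search alpha :", alpha_grid)
print("radius at alpha=1 :", spectral_radius(1.0, lam))
print("radius at alpha*  :", spectral_radius(alpha_star, lam))
```

In this toy setting the grid search and the closed-form value agree up to the grid resolution, and the radius at $\alpha^*$ is smaller than at $\alpha = 1$, mirroring the ADMM versus oADMM comparison in the figures.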

Theorems & Definitions (8)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Theorem 4
  • proof