Practical Acceleration of the Condat-Vũ Algorithm

Derek Driggs, Matthias J. Ehrhardt, Carola-Bibiane Schönlieb, Junqi Tang

TL;DR

This work shows that a simple adjustment to the Condat-Vũ algorithm allows it to recover accelerated PGD (APGD) as a special case, instead of PGD, and proves that this accelerated Condat-Vũ algorithm achieves optimal convergence rates and significantly outperforms the traditional Condat-Vũ algorithm in regimes where the Condat-Vũ algorithm approximates the dynamics of PGD.

Abstract

The Condat-Vũ algorithm is a widely used primal-dual method for optimizing composite objectives of three functions. Several algorithms for optimizing composite objectives of two functions are special cases of Condat-Vũ, including proximal gradient descent (PGD). It is well known that PGD is suboptimal: a simple adjustment accelerates its convergence rate from $\mathcal{O}(1/T)$ to $\mathcal{O}(1/T^2)$ on convex objectives, and this accelerated rate is optimal. In this work, we show that a simple adjustment to the Condat-Vũ algorithm allows it to recover accelerated PGD (APGD) as a special case, instead of PGD. We prove that this accelerated Condat-Vũ algorithm achieves optimal convergence rates and significantly outperforms the traditional Condat-Vũ algorithm in regimes where the Condat-Vũ algorithm approximates the dynamics of PGD. We demonstrate the effectiveness of our approach in various applications in machine learning and computational imaging.
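The gap between PGD and its accelerated variant can be illustrated with a minimal sketch. It is not the paper's accelerated Condat-Vũ method. It only compares plain PGD with the standard FISTA-style momentum scheme on a lasso problem, $\min_x \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$. The test problem, the function names (`make_problem`, `pgd`, `apgd`) and the parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(A, b, lam, x):
    """Lasso objective 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * np.abs(x).sum()


def make_problem(n=40, d=100, seed=0):
    """Hypothetical random test problem (an assumption for illustration)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d))
    b = rng.standard_normal(n)
    return A, b


def pgd(A, b, lam, T):
    """Proximal gradient descent with step size 1/L; O(1/T) rate."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    history = [objective(A, b, lam, x)]
    for _ in range(T):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
        history.append(objective(A, b, lam, x))
    return x, history


def apgd(A, b, lam, T):
    """Accelerated PGD (FISTA-style momentum); O(1/T^2) rate."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    z = x.copy()  # extrapolated point at which the gradient is evaluated
    t = 1.0
    history = [objective(A, b, lam, x)]
    for _ in range(T):
        grad = A.T @ (A @ z - b)
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
        history.append(objective(A, b, lam, x))
    return x, history
```

The only difference between the two loops is the extrapolation step that builds `z`. Calling `pgd(A, b, 0.1, 200)` and `apgd(A, b, 0.1, 200)` on the same `A, b = make_problem()` and plotting the two `history` lists is one way to compare the objective decay. Note that the accelerated iterates are not guaranteed to decrease the objective monotonically.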

Paper Structure

This paper contains 22 sections, 6 theorems, 90 equations, 9 figures, 1 table, 4 algorithms.

Key Result

Lemma 3.1

Suppose $g$ is $\mu_g$-strongly convex and $f^*$ is $\mu_{f^*}$-strongly convex with $\mu_g, \mu_{f^*} \ge 0$. The iterates of Algorithm \ref{alg:framework} satisfy the following inequality for any $(x,y) \in \mathcal{X} \times \mathcal{Y}$:

Figures (9)

  • Figure 1: Performance comparison of the proposed Accelerated Condat-Vũ (ACV), Condat-Vũ (CV) with standard parameter settings, and Condat-Vũ with tuned parameters on problem \ref{eq:ggfen}. These results show that in the regime where $L \gg \|A\|_{\textnormal{op}}^2 / \mu_{f^*}$, ACV significantly outperforms CV, matching our theoretical convergence rates.
  • Figure 2: Performance comparison of ACV, Condat-Vũ with standard parameter settings \cite{CPrates}, and Condat-Vũ with tuned parameters on problem \ref{eq:ggfen} with $\lambda_3 \to \infty$. These results show that in the regime where $L \gg \|A\|_{\textnormal{op}}^2$, ACV significantly outperforms CV and matches our theoretical convergence rates.
  • Figure 3: Results for the inpainting experiment (example 1). Here we compare the performance of CV with fine-tuned step-sizes and ACV with the step-size rule \ref{step_size_default}. In the second row we show the images reconstructed by the algorithms at the 1200th iteration.
  • Figure 4: Results for the inpainting experiment (example 2) with strong convexity on $g$. Here we compare the performance of CV with fine-tuned step-sizes and ACV with all three step-size rules. We observe that the ACV presented in Alg. 4.2 provides significant acceleration. When the objective has significant strong convexity, it is crucial for ACV to use the step-size rule that exploits it. In the second row we show the images reconstructed by the algorithms at the 1200th iteration.
  • Figure 5: Results for the deblurring experiment (example 1). Here we compare the performance of CV with fine-tuned step-sizes and ACV with the step-size rule \ref{step_size_default}. The solid black curve records the performance of ACV with the rescaling trick \ref{res}, while the dashed curve shows ACV without the rescaling trick (the same convention applies in the following figures). In the second row we show the images reconstructed by the algorithms at the 200th iteration.
  • ...and 4 more figures

Theorems & Definitions (18)

  • Lemma 3.1
  • proof
  • Lemma 3.2: One-Iteration Progress
  • proof
  • Theorem 4.1
  • proof
  • Remark 4.2: Suggested Parameter Settings
  • Remark 4.3
  • Theorem 4.4
  • proof
  • ...and 8 more