A flexible block-coordinate forward-backward algorithm for non-smooth and non-convex optimization

Luis Briceño-Arias, Paulo Gonçalves, Guillaume Lauga, Nelly Pustelnik, Elisa Riccietti

TL;DR

This work introduces FLEX-BC-PG, a flexible, deterministic block-coordinate forward-backward algorithm that supports parallel and essentially cyclic updates for non-smooth, non-convex objectives $\Psi(\mathbf{x})=f(\mathbf{x})+\sum_{\ell} g_\ell(x_\ell)$. The authors establish state-of-the-art convergence guarantees under the Kurdyka-Łojasiewicz framework, including descent, bounded iterates, and convergence of iterates to a critical point, with rates determined by the KL exponent. A key contribution is linking FLEX-BC-PG to multilevel image restoration via first-order coherence, allowing hierarchical block updates that mimic multilevel optimizations while retaining BC-PG convergence properties. The paper demonstrates practical benefits on wavelet-based image deblurring tasks, showing that FLEX-BC-PG and its multilevel variants outperform standard forward-backward and traditional BC-PG schemes, highlighting the method’s potential for large-scale, structured optimization in imaging and beyond.
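To make the objective $\Psi(\mathbf{x})=f(\mathbf{x})+\sum_{\ell} g_\ell(x_\ell)$ and the block-wise forward-backward step concrete, here is a minimal sketch of a generic block-coordinate proximal gradient loop. It is not the authors' implementation. Everything in it is an assumption made for illustration: the least-squares term $f$, the convex $\ell_1$ penalties $g_\ell$ (the paper also covers non-convex ones), the single global step size $1/L$, the hand-written `schedule`, and all function names.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1 (assumed penalty, convex for simplicity)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(A, b, x, blocks, lam):
    """Psi(x) = 0.5 * ||A x - b||^2 + sum_l lam[l] * ||x_l||_1 (assumed model)."""
    data_fit = 0.5 * np.sum((A @ x - b) ** 2)
    penalty = sum(lam[l] * np.sum(np.abs(x[idx])) for l, idx in enumerate(blocks))
    return data_fit + penalty


def bc_prox_grad(A, b, blocks, schedule, lam, n_cycles=50):
    """Block-coordinate forward-backward with a fixed, repeated schedule.

    schedule: list of lists; schedule[i] holds the block indices updated
    (in parallel, from the same gradient) at the i-th iteration of a cycle.
    """
    x = np.zeros(A.shape[1])
    # Global Lipschitz constant of grad f is ||A||_2^2; step 1/L gives descent.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    history = [objective(A, b, x, blocks, lam)]
    for _ in range(n_cycles):
        for active in schedule:
            # Full gradient computed for clarity; only active blocks are used.
            grad = A.T @ (A @ x - b)
            x_new = x.copy()
            for l in active:
                idx = blocks[l]
                # Forward (gradient) step, then backward (proximal) step.
                x_new[idx] = soft_threshold(x[idx] - step * grad[idx], step * lam[l])
            x = x_new
            history.append(objective(A, b, x, blocks, lam))
    return x, history


# Hypothetical toy problem: 12 unknowns split into 4 blocks of size 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 12))
b = rng.standard_normal(30)
blocks = [np.arange(3 * l, 3 * (l + 1)) for l in range(4)]
lam = [0.1, 0.1, 0.1, 0.1]
# Block 0 is prioritized, yet every block is updated at least once per cycle.
schedule = [[0], [0, 1], [2], [0, 3]]
x_hat, history = bc_prox_grad(A, b, blocks, schedule, lam)
```

In this sketch, the step size $1/L$ with $L=\|A\|_2^2$ keeps `history` non-increasing whichever subset of blocks is active. The schedule only has to visit every block within each cycle.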

Abstract

Block coordinate descent (BCD) methods are prevalent in large-scale optimization problems due to their low memory and computational costs per iteration, their predisposition to parallelization, and their ability to exploit the structure of the problem. The theoretical and practical performance of BCD relies heavily on the rules that define which blocks are updated at each iteration. We propose a new deterministic BCD framework that allows for very flexible updates while retaining state-of-the-art convergence guarantees on non-smooth, non-convex optimization problems. While encompassing several update rules from the literature, this framework allows priority to be given to updates of particular blocks and permits correlations in the block selection between iterations, neither of which is possible under the classical convergent stochastic framework. This flexibility is leveraged in the context of multilevel optimization algorithms and, in particular, in multilevel image restoration problems, where the efficiency of the approach is illustrated.

Paper Structure

This paper contains 37 sections, 15 theorems, 90 equations, 4 figures.

Key Result

Proposition 3.1

Subdifferentiability property [VarAnalRockafellar]. Let $\Psi$ be defined as in problem (eq5:optim). Then, for all $x = (x_1,\ldots,x_L)\in \mathcal{H}_1 \times \ldots \times\mathcal{H}_L$, we have

Figures (4)

  • Figure 1: Examples of update rules that are covered by our proposed algorithm $\mathtt{FLEX-BC-PG}$ for a problem with $4$ blocks. The top row displays existing rules that are covered by our framework; the bottom row displays new rules that it now also covers. The variable of the $\ell$-th block at iteration $n$ is indexed as $x_\ell^n$. At the top of each scheme we display the iteration number $n$, and at the bottom the cycle number $k$. At each iteration, the blocks that are updated are highlighted in red, so each column depicts the activation of the blocks at a given iteration. All these rules share a common feature, necessary to make the corresponding BCD method convergent: each block must be updated at least once during a cycle. Here, each cycle contains $4$ iterations, but the number of iterations per cycle need not equal the number of blocks.
  • Figure 2: Two-level decomposition of an image $\mathbf{u}$ with a wavelet transform. We regroup the detail coefficients $d_1$, $d_2$ and $d_3$ into a single block $d$ to simplify the presentation of our two-level or two-block proximal gradient descent algorithms.
  • Figure 3: Update scheme of the two-block coordinate descent algorithm. The blocks are updated in a cyclic fashion: first the approximation block is updated alone ($a^1$ in red), then the approximation and detail blocks are updated together ($a^2$ and $d^2$ in red). The figure shows two cycles, $k=1$ and $k=2$, for a total of $n=4$ iterations.
  • Figure 4: Comparison of the convergence of $\mathtt{FB}$ (red), $\mathtt{cyclic ~BC-PG}$ (green), $\mathtt{random ~ BC-PG}$ (dark green), $\mathtt{FLEX-BC-PG}$ (black), $\mathtt{Stochastic ~FLEX-BC-PG}$ (cyan), and $\mathtt{Alternating ~(PI) ~FLEX-BC-PG}$ (blue) for the deconvolution problem regularized with a $2$-level log-sum Haar wavelet penalty on a $1024 \times 1024$ image of the Cameraman. Degradation: Gaussian noise with $\sigma_{\mathrm{noise}} = 0.01$ and a Gaussian blur of size $40\times 40$ with standard deviation $7$. Parameter choice: $\lambda_a = 1 \times 10^{-10}$, $\lambda_d = 1 \times 10^{-4}$, $m=8$.
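
The condition shared by the update rules of Figure 1, and by the two-block scheme of Figure 3, is that each block is updated at least once during a cycle. The sketch below checks that condition for a schedule given as a list of sets of active blocks. It is a simplified illustration: the fixed cycle length, the function name `covers_all_blocks`, and the example schedules are assumptions, and the schedules are not transcriptions of the rules drawn in Figure 1.

```python
def covers_all_blocks(schedule, n_blocks, cycle_len):
    """Return True if every cycle of `cycle_len` iterations updates each block.

    schedule: list of sets; schedule[n] holds the blocks active at iteration n.
    The fixed cycle length is a simplifying assumption of this sketch.
    """
    if cycle_len <= 0 or len(schedule) % cycle_len != 0:
        raise ValueError("schedule length must be a positive multiple of cycle_len")
    all_blocks = set(range(n_blocks))
    for start in range(0, len(schedule), cycle_len):
        # Union of the blocks activated during this cycle.
        updated = set().union(*schedule[start:start + cycle_len])
        if not updated >= all_blocks:
            return False
    return True


# Hypothetical schedules for a 4-block problem, one cycle of 4 iterations each.
cyclic = [{0}, {1}, {2}, {3}]             # one block per iteration
fully_parallel = [{0, 1, 2, 3}] * 4       # all blocks at every iteration
prioritized = [{0}, {0, 1}, {0, 2}, {0, 3}]  # block 0 updated every time
incomplete = [{0}, {1}, {0}, {1}]         # blocks 2 and 3 are never updated

# Two-block scheme in the spirit of Figure 3: block 0 (approximation) alone,
# then blocks 0 and 1 (approximation and detail) together; two cycles.
two_block = [{0}, {0, 1}, {0}, {0, 1}]

results = {
    "cyclic": covers_all_blocks(cyclic, 4, 4),
    "fully_parallel": covers_all_blocks(fully_parallel, 4, 4),
    "prioritized": covers_all_blocks(prioritized, 4, 4),
    "incomplete": covers_all_blocks(incomplete, 4, 4),
    "two_block": covers_all_blocks(two_block, 2, 2),
}
```

Only the `incomplete` schedule fails the check, since blocks 2 and 3 are never activated within its cycle.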

Theorems & Definitions (32)

  • Definition 1
  • Proposition 3.1
  • Definition 2
  • Definition 3
  • Lemma 3.2
  • Remark 3.3
  • Proposition 3.4
  • proof
  • Remark 3.5
  • Lemma 3.6
  • ...and 22 more