Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation

Zakhar Shumaylov, Jeremy Budd, Subhadip Mukherjee, Carola-Bibiane Schönlieb

TL;DR

This work develops a unified theory of convergent regularisation for inverse problems with weakly convex regularisers, formulated in terms of critical points rather than global minimisers. It proves existence, stability, and convergence for the regularised problem, and shows that the primal-dual hybrid gradient method converges, with an $\mathcal{O}(\log{k}/k)$ ergodic rate under a Kurdyka-Lojasiewicz condition. A further contribution is a universal approximation result for input weakly convex neural networks (IWCNNs), which enables adversarial weakly convex regularisers (AWCR) that retain a distance-function interpretation while preserving these guarantees. The framework is evaluated on sparse-view and limited-angle CT, where AWCR and AWCR-PD perform competitively with, or better than, state-of-the-art baselines.

Abstract

Variational regularisation is the primary method for solving inverse problems, and recently there has been considerable work leveraging deeply learned regularisation for enhanced performance. However, few results exist addressing the convergence of such regularisation, particularly within the context of critical points as opposed to global minimisers. In this paper, we present a generalised formulation of convergent regularisation in terms of critical points, and show that this is achieved by a class of weakly convex regularisers. We prove convergence of the primal-dual hybrid gradient method for the associated variational problem, and, given a Kurdyka-Lojasiewicz condition, an $\mathcal{O}(\log{k}/k)$ ergodic convergence rate. Finally, applying this theory to learned regularisation, we prove universal approximation for input weakly convex neural networks (IWCNN), and show empirically that IWCNNs can lead to improved performance of learned adversarial regularisers for computed tomography (CT) reconstruction.

Paper Structure

This paper contains 35 sections, 21 theorems, 98 equations, 4 figures, and 1 table.

Key Result

Theorem 2.6

Let $\operatorname{\mathcal{R}} = R_{wc}+R_{sc}$, where $R_{wc}:\operatorname{\mathcal{X}}\to[0,\infty)$ is $\gamma$-weakly convex and $R_{sc}$ is $\mu$-strongly convex. For any critical point $\hat{x}$ of $\operatorname{\mathcal{R}}$, we have that for all $z\in\operatorname{\mathcal{X}}$: If $\gamma or if $R_{wc}$ is $L_R$-Lipschitz continuous, Here $\|\partial R_{sc}(z)\|:= \sup \{ \|\psi\|_{\op
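
For orientation, the two convexity notions used in the theorem can be written out as follows. This is a sketch under the usual convention on a Hilbert space, which is assumed (not confirmed by this page) to agree with the paper's Definition 2.1. It recalls only the definitions and one elementary consequence for the sum, not the theorem's bound.

```latex
% Assumed standard convention, with \|\cdot\| the norm of \mathcal{X}:
R_{wc} \text{ is } \gamma\text{-weakly convex}
  \iff R_{wc} + \tfrac{\gamma}{2}\|\cdot\|^2 \text{ is convex},
\qquad
R_{sc} \text{ is } \mu\text{-strongly convex}
  \iff R_{sc} - \tfrac{\mu}{2}\|\cdot\|^2 \text{ is convex}.

% Elementary consequence for \mathcal{R} = R_{wc} + R_{sc}:
\mathcal{R} - \tfrac{\mu-\gamma}{2}\|\cdot\|^2
  = \bigl(R_{wc} + \tfrac{\gamma}{2}\|\cdot\|^2\bigr)
  + \bigl(R_{sc} - \tfrac{\mu}{2}\|\cdot\|^2\bigr)
  \text{ is convex.}
```

Under this convention, $\operatorname{\mathcal{R}}$ is $(\mu-\gamma)$-strongly convex when $\gamma<\mu$, and in general only $(\gamma-\mu)$-weakly convex when $\gamma>\mu$.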

Figures (4)

  • Figure 1: Contour plots comparing the distance function to a data manifold (top-left) with learned regularisers for denoising this data, via convex (bottom-left), standard (top-right), and weakly convex (bottom-right) adversarial regularisation. The convex regulariser lacks an interpretation as a distance function, whilst the weakly convex regulariser, introduced in this work, retains this feature and shows improved generalisation.
  • Figure 2: Reconstructed images obtained using different methods, along with the associated PSNR and SSIM, for sparse-view CT. In this case, AWCR and AWCR-PD achieve the highest PSNR and SSIM. Furthermore, both AWCR methods retain the fine structure in the reconstruction, unlike ACNCR and ACR, the only other methods that possess convergence guarantees.
  • Figure 3: Example of a positive weakly convex function with $\gamma > 2\mu$ and unbounded stationary points.
  • Figure 4: Reconstructed images obtained using different methods, along with the associated PSNR and SSIM, for limited-view CT.

Theorems & Definitions (53)

  • Definition 2.1: $\rho$-convexity
  • Definition 2.2: Subdifferential; see e.g. Kruger2003
  • Definition 2.4: $\operatorname{\mathcal{R}}$-minimising and $\operatorname{\mathcal{R}}$-criticising solutions
  • Example 2.5
  • Theorem 2.6: Bounded critical points
  • proof
  • Theorem 3.2: Existence
  • proof
  • Theorem 3.3: Stability
  • proof
  • ...and 43 more
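
The list above opens with $\rho$-convexity (Definition 2.1). As a rough numerical illustration, the sketch below assumes the standard convention that a function $f$ is $\gamma$-weakly convex when $f + \tfrac{\gamma}{2}\|x\|^2$ is convex, and checks this on a one-dimensional grid using second differences. The test function $\log(1+x^2)$ and the helper names `log_penalty` and `is_weakly_convex_on_grid` are illustrative choices, not taken from the paper. Its second derivative is $(2-2x^2)/(1+x^2)^2$, with minimum $-1/4$ at $x=\pm\sqrt{3}$, so it is weakly convex exactly for $\gamma \ge 1/4$.

```python
import math


def log_penalty(x: float) -> float:
    """Illustrative non-negative, non-convex penalty: log(1 + x^2)."""
    return math.log1p(x * x)


def is_weakly_convex_on_grid(f, gamma, lo=-10.0, hi=10.0, n=2001, tol=1e-9):
    """Grid check of gamma-weak convexity in 1D (assumed convention).

    Returns True if g(x) = f(x) + (gamma / 2) * x^2 has non-negative
    second differences at every interior grid point of [lo, hi].
    This is a necessary condition for convexity on the grid, not a proof.
    """
    h = (hi - lo) / (n - 1)

    def g(x):
        return f(x) + 0.5 * gamma * x * x

    for i in range(1, n - 1):
        x = lo + i * h
        second_diff = g(x - h) - 2.0 * g(x) + g(x + h)
        if second_diff < -tol:
            return False
    return True


if __name__ == "__main__":
    print(is_weakly_convex_on_grid(log_penalty, 0.3))  # True  (0.3 >= 1/4)
    print(is_weakly_convex_on_grid(log_penalty, 0.2))  # False (0.2 <  1/4)
    print(is_weakly_convex_on_grid(log_penalty, 0.0))  # False (not convex)
```

A grid check of this kind can only refute weak convexity or fail to refute it on the sampled interval. It does not certify the property for a learned regulariser, which is why architectural guarantees such as those of IWCNNs are of interest.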