Learning Difference-of-Convex Regularizers for Inverse Problems: A Flexible Framework with Theoretical Guarantees

Yasi Zhang, Oscar Leong

TL;DR

This work tackles ill-posed inverse problems by learning flexible, DC-structured regularizers that balance modeling power with theoretical guarantees. The regularizers are parameterized as the difference of two Input Convex Neural Networks (ICNNs); the resulting difference-of-convex network (IDCNN) gives expressive priors while permitting optimization via the Difference-of-Convex Algorithm (DCA) and a Proximal Subgradient Method (PSM), with convergence guarantees under mild smoothness or Kurdyka-Łojasiewicz (KL) conditions. A star-geometry analysis, based on $\alpha$-homogeneous gauges and dual mixed volumes, characterizes when optimal regularizers can be expressed as DC functions. Experiments on Computed Tomography (CT) reconstruction show that the proposed ADCR framework, including its DCA and PSM variants, outperforms other weakly supervised learned regularizers in sparse-view and limited-angle settings.
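To make the parameterization concrete, the following is a minimal NumPy sketch of a regularizer built as the difference of two input-convex networks. It is an illustration under stated assumptions, not the authors' architecture: the two-layer depth, the ReLU activation, the random weights, and the helper names `make_icnn`, `icnn`, and `idcnn` are all hypothetical. Convexity in the input comes from keeping the weights that act on hidden activations nonnegative and using a convex, nondecreasing activation.

```python
import numpy as np


def make_icnn(dim, hidden, rng):
    """Random parameters for a small two-layer ICNN (hypothetical helper)."""
    return {
        "W0": rng.normal(size=(hidden, dim)),
        "b0": rng.normal(size=hidden),
        # Weights acting on hidden activations must be nonnegative.
        "Wz": np.abs(rng.normal(size=(hidden, hidden))),
        # Skip connection from the input may have any sign.
        "Wx": rng.normal(size=(hidden, dim)),
        "b1": rng.normal(size=hidden),
        # Nonnegative output weights keep the scalar output convex.
        "w_out": np.abs(rng.normal(size=hidden)),
    }


def icnn(params, x):
    """Scalar output that is convex in x."""
    # ReLU of an affine map: each coordinate is convex in x.
    z = np.maximum(params["W0"] @ x + params["b0"], 0.0)
    # Nonnegative mix of convex terms plus an affine term, then ReLU.
    z = np.maximum(params["Wz"] @ z + params["Wx"] @ x + params["b1"], 0.0)
    return float(params["w_out"] @ z)


def idcnn(params_1, params_2, x):
    """Difference of two ICNNs: R(x) = R_1(x) - R_2(x), nonconvex in general."""
    return icnn(params_1, x) - icnn(params_2, x)


rng = np.random.default_rng(0)
r1 = make_icnn(dim=3, hidden=8, rng=rng)
r2 = make_icnn(dim=3, hidden=8, rng=rng)
value = idcnn(r1, r2, np.array([0.5, -1.0, 2.0]))
```

Each component stays convex, so a subgradient of the second network can be used to linearize it, which is the structural property that DCA-type methods rely on.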

Abstract

Learning effective regularization is crucial for solving ill-posed inverse problems, which arise in a wide range of scientific and engineering applications. While data-driven methods that parameterize regularizers using deep neural networks have demonstrated strong empirical performance, they often result in highly nonconvex formulations that lack theoretical guarantees. Recent work has shown that incorporating structured nonconvexity into neural network-based regularizers, such as weak convexity, can strike a balance between empirical performance and theoretical tractability. In this paper, we demonstrate that a broader class of nonconvex functions, difference-of-convex (DC) functions, can yield improved empirical performance while retaining strong convergence guarantees. The DC structure enables the use of well-established optimization algorithms, such as the Difference-of-Convex Algorithm (DCA) and a Proximal Subgradient Method (PSM), which extend beyond standard gradient descent. Furthermore, we provide theoretical insights into the conditions under which optimal regularizers can be expressed as DC functions. Extensive experiments on computed tomography (CT) reconstruction tasks show that our approach achieves strong performance across sparse and limited-view settings, consistently outperforming other weakly supervised learned regularizers. Our code is available at https://github.com/YasminZhang/ADCR.
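As an illustration of how DCA exploits DC structure, the sketch below applies it to a toy denoising problem. Everything specific here is an assumption made for illustration: the learned regularizer is swapped for the hand-picked DC function $\lambda(\|x\|_1 - \|x\|_2)$, the forward operator is the identity, and the names `soft_threshold`, `objective`, and `dca` are hypothetical. With the split $g(x) = \tfrac12\|x-y\|^2 + \lambda\|x\|_1$ and $h(x) = \lambda\|x\|_2$, each DCA step linearizes $h$ at the current iterate and solves the remaining convex subproblem, which here has a closed form.

```python
import numpy as np


def soft_threshold(z, lam):
    """Proximal operator of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def objective(x, y, lam):
    """f(x) = 0.5 * ||x - y||^2 + lam * (||x||_1 - ||x||_2)."""
    return 0.5 * np.sum((x - y) ** 2) + lam * (np.abs(x).sum() - np.linalg.norm(x))


def dca(y, lam, n_iter=50):
    """DCA for f = g - h with g = 0.5||x - y||^2 + lam||x||_1, h = lam||x||_2."""
    x = y.copy()
    history = [objective(x, y, lam)]
    for _ in range(n_iter):
        # Subgradient of the concave part's negation, h, at the current iterate.
        norm = np.linalg.norm(x)
        v = lam * x / norm if norm > 0 else np.zeros_like(x)
        # Convex subproblem: argmin_x g(x) - <v, x> = soft_threshold(y + v, lam).
        x = soft_threshold(y + v, lam)
        history.append(objective(x, y, lam))
    return x, history


y_obs = np.array([3.0, -0.2, 0.1, 1.5])  # toy noisy observation
x_hat, hist = dca(y_obs, lam=0.5)
```

The objective values in `hist` are nonincreasing, which is the basic descent property of DCA. In the paper's setting the convex subproblem would instead involve the data-fidelity term and a learned convex network, and would typically be solved by an inner iterative loop rather than in closed form.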

Paper Structure

This paper contains 42 sections, 5 theorems, 53 equations, 7 figures, 2 tables, 2 algorithms.

Key Result

Proposition 3.1

If $F = G - H$ is the difference of two Lipschitz convex functions over a compact domain $K$, then, for any $\varepsilon > 0$, there exists an IDCNN $\mathcal{R}_1-\mathcal{R}_2$ such that

Figures (7)

  • Figure 1: Contour plots comparing the distance function to a data manifold with learned regularizers for denoising spiral manifold data, using standard convex (ICNN), weakly convex (IWCNN), general nonconvex (NN) and difference-of-convex (IDCNN) adversarial regularization. The DC regularizer, which generalizes the weakly convex case, demonstrates improved generalization and fit to the data manifold.
  • Figure 2: Qualitative comparison of reconstructed images obtained using different methods, along with the associated PSNR and SSIM, for sparse-view Computed Tomography. ADCR and its variant, ADCR-PSM, recover the fine structure of the ground-truth image more effectively than the other weakly supervised methods, capturing intricate details that enhance visual fidelity.
  • Figure 3: Ablation on inner-loop iteration number $N$ of DCA in the limited-view setting. DCA consistently improves ADCR across various choices of the inner-loop iteration number $N$. Results in the sparse-view setting are similar. The dashed lines represent the results of ADCR obtained through gradient descent. For DCA, we choose the optimal $N=6$ for the limited-view setting and $N=5$ for the sparse-view setting.
  • Figure 4: We visualize the geometry of a DC regularizer's level set in terms of its individual convex components. In particular, the gauge of $K$ can be written as the difference of two convex gauges, induced by the sets $M_1$ and $C$. $M_1$ can also be understood as a particular type of star body addition between $K$ and $C$. In directions where the boundaries $M_1$ and $C$ are close, $K$ is "spiky" and its boundary is further from the origin.
  • Figure 5: Ablation on inner-loop iteration number $N$ of PSM in the limited-view setting. PSM consistently improves ADCR across various choices of the inner-loop iteration number $N$. Results in the sparse-view setting are similar. The dashed lines represent the results of ADCR obtained through gradient descent. For PSM, we choose $N=1$ for both settings.
  • ...and 2 more figures

Theorems & Definitions (13)

  • Proposition 3.1
  • proof
  • Theorem 4.1
  • Definition 4.2: Lutwak96
  • Corollary 4.3
  • Theorem 5.1
  • Theorem 5.2
  • Definition A.1: Definition 2* in Lutwak1975
  • proof: Proof of Theorem \ref{thm:alpha-hom-theorem}
  • proof: Proof of Corollary \ref{cor:dc-result}
  • ...and 3 more