
Unsupervised Training of Convex Regularizers using Maximum Likelihood Estimation

Hong Ye Tan, Ziruo Cai, Marcelo Pereyra, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb

TL;DR

This work tackles unsupervised learning for image reconstruction by learning a convex regularizer $g_\theta(x)$ via maximum marginal likelihood estimation. Reconstruction is framed as MAP with $p(x|y,\theta) \propto \ell(y|x)\,p(x|\theta)$, and the gradient of the marginal likelihood is estimated using two coupled MCMC samplers (posterior and prior) through Fisher's identity. The convex ridge regularizer enables convergence guarantees for the stochastic approximation procedure (SAPG-ULA), scales to dataset-size training with mini-batching, and yields priors that are nearly competitive with supervised methods while offering strong generalization to unseen forward operators. Empirical results on Gaussian deconvolution and Poisson denoising demonstrate favorable trade-offs against end-to-end and classical priors, with theoretical convergence supporting the reliability of the learning dynamics.
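
The gradient estimate mentioned above can be written out explicitly. The derivation below is a sketch under the assumption that the prior takes the Gibbs form $p(x|\theta) = \exp(-g_\theta(x))/Z(\theta)$ and that the likelihood $\ell(y|x)$ does not depend on $\theta$; the normalizing constant $Z(\theta)$ is notation introduced here for illustration.

```latex
\begin{align*}
p(y \mid \theta) &= \int \ell(y \mid x)\, p(x \mid \theta)\, \mathrm{d}x,
\qquad
p(x \mid \theta) = \frac{\exp(-g_\theta(x))}{Z(\theta)}, \\
\nabla_\theta \log p(y \mid \theta)
  &= \mathbb{E}_{x \mid y, \theta}\big[\nabla_\theta \log p(x \mid \theta)\big]
  && \text{(Fisher's identity)} \\
  &= -\mathbb{E}_{x \mid y, \theta}\big[\nabla_\theta g_\theta(x)\big]
     - \nabla_\theta \log Z(\theta) \\
  &= \mathbb{E}_{x \mid \theta}\big[\nabla_\theta g_\theta(x)\big]
     - \mathbb{E}_{x \mid y, \theta}\big[\nabla_\theta g_\theta(x)\big],
\end{align*}
% using \nabla_\theta \log Z(\theta) = -\mathbb{E}_{x \mid \theta}[\nabla_\theta g_\theta(x)].
% The MAP reconstruction under the same assumptions is
\begin{equation*}
\hat{x}_{\mathrm{MAP}} \in \arg\min_x \; \big\{ -\log \ell(y \mid x) + g_\theta(x) \big\}.
\end{equation*}
```

The two expectations are intractable in general, which is why one sampler targets the prior $p(x|\theta)$ and the other the posterior $p(x|y,\theta)$.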

Abstract

Imaging is a standard example of an inverse problem, where the task of reconstructing a ground truth from a noisy measurement is ill-posed. Recent state-of-the-art approaches for imaging use deep learning, spearheaded by unrolled and end-to-end models trained on various image datasets. However, many such methods require ground truth data, which may be unavailable or expensive to obtain, leading to a fundamental barrier that cannot be bypassed by choice of architecture. Unsupervised learning presents an alternative paradigm that avoids this requirement, as models can be trained directly on noisy data without any ground truths. A principled Bayesian approach to unsupervised learning is to maximize the marginal likelihood with respect to the given noisy measurements, which is intrinsically linked to classical variational regularization. We propose an unsupervised approach using maximum marginal likelihood estimation to train a convex neural network-based image regularization term directly on noisy measurements, improving upon previous work in both model expressiveness and dataset size. Experiments demonstrate that the proposed method produces priors that are nearly competitive with the analogous supervised training method for various image corruption operators, while maintaining significantly better generalization properties than end-to-end methods. Moreover, we provide a detailed theoretical analysis of the convergence properties of our proposed algorithm.
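
The link to variational regularization can be illustrated with a small example. The sketch below assumes a ridge-type regularizer $g(x) = \sum_i \psi(w_i^\top x)$ with a fixed convex profile $\psi = \log\cosh$ and a fixed finite-difference filter bank `W`. Both are illustrative stand-ins rather than the learned parameterization of the paper, and every function name is hypothetical. The MAP estimate for Gaussian denoising is then the minimizer of a convex variational objective, computed here by plain gradient descent.

```python
import numpy as np

def log_cosh(t):
    """Numerically stable log(cosh(t)); a smooth convex profile (assumed choice)."""
    a = np.abs(t)
    return a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)

def ridge_regularizer(x, W, scale=1.0):
    """g(x) = scale * sum_i psi(w_i^T x); convex because psi is convex."""
    return scale * np.sum(log_cosh(W @ x))

def ridge_gradient(x, W, scale=1.0):
    """Gradient of g: scale * W^T psi'(W x), with psi' = tanh."""
    return scale * (W.T @ np.tanh(W @ x))

def objective(x, y, W, sigma, scale=1.0):
    """Negative log-posterior up to a constant: Gaussian data fit plus g(x)."""
    return np.sum((x - y) ** 2) / (2.0 * sigma**2) + ridge_regularizer(x, W, scale)

def map_denoise(y, W, sigma, scale=1.0, n_iters=500):
    """Gradient descent on the convex objective with step size 1/L."""
    # L bounds the Lipschitz constant of the gradient, since psi'' <= 1.
    lipschitz = 1.0 / sigma**2 + scale * np.linalg.norm(W, 2) ** 2
    x = y.copy()
    for _ in range(n_iters):
        grad = (x - y) / sigma**2 + ridge_gradient(x, W, scale)
        x = x - grad / lipschitz
    return x

# Toy 1D example: piecewise-constant signal with additive Gaussian noise.
rng = np.random.default_rng(0)
n, sigma = 64, 0.3
clean = np.repeat([0.0, 1.0, -0.5, 0.5], n // 4)
y = clean + sigma * rng.standard_normal(n)
W = np.diff(np.eye(n), axis=0)  # rows are finite-difference filters
x_map = map_denoise(y, W, sigma, scale=5.0)
print(objective(y, y, W, sigma, 5.0), objective(x_map, y, W, sigma, 5.0))
```

Because the step size is `1/L` for an `L`-smooth convex objective, each iteration cannot increase the objective, so the printed value at `x_map` is no larger than the value at the noisy input `y`.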

Paper Structure (21 sections, 6 theorems, 50 equations, 3 figures, 4 tables, 2 algorithms)


Key Result

Theorem 1

Suppose that $-\log p(y|\theta)$ is convex w.r.t. $\theta$, and that assumptions (asm:compact_set), (asm:Lipschitz_f), and (asm:sum_of_integrals) hold. Under certain technical Lipschitz conditions and decaying step-sizes, a single sample is sufficient, i.e., $m_n=1$ leads to almost sure convergence of $(\theta_n)_{n \in \mathbb{N}}$ ...
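
The single-sample regime $m_n = 1$ with decaying step sizes can be illustrated on a toy problem. The sketch below is not the paper's algorithm or model: it assumes a scalar quadratic regularizer $g_\theta(x) = \theta x^2 / 2$, Gaussian noise of known standard deviation `sigma`, a compact parameter set `theta_bounds`, and a step-size schedule `delta0 * n**-0.6`, all chosen here for illustration. It makes no claim that this toy model satisfies the theorem's assumptions. Each iteration advances a posterior and a prior ULA chain by one step, then takes a projected gradient step using the difference of the two sample averages of $\nabla_\theta g_\theta(x) = x^2/2$. For this toy model the maximum marginal likelihood estimate is available in closed form, which gives a reference value.

```python
import numpy as np

def sapg_ula_toy(y, sigma, theta0=3.0, n_iters=5000, gamma=0.02,
                 delta0=0.5, theta_bounds=(0.1, 10.0), seed=0):
    """Projected stochastic-approximation updates of theta (hypothetical helper).

    Uses one ULA step per chain and per iteration, i.e. m_n = 1.
    """
    rng = np.random.default_rng(seed)
    lo, hi = theta_bounds
    theta = theta0
    x_post = y.copy()                      # one posterior chain per measurement
    x_prior = rng.standard_normal(y.shape) # matching number of prior chains
    history = np.empty(n_iters)
    for n in range(1, n_iters + 1):
        # ULA step targeting p(x | y, theta) ∝ exp(-(x - y)^2 / (2 sigma^2) - theta x^2 / 2).
        grad_post = (x_post - y) / sigma**2 + theta * x_post
        x_post = x_post - gamma * grad_post + np.sqrt(2 * gamma) * rng.standard_normal(y.shape)
        # ULA step targeting p(x | theta) ∝ exp(-theta x^2 / 2).
        x_prior = x_prior - gamma * theta * x_prior + np.sqrt(2 * gamma) * rng.standard_normal(y.shape)
        # Estimate of the log marginal likelihood gradient:
        # E_prior[x^2 / 2] - E_posterior[x^2 / 2].
        grad_theta = 0.5 * (np.mean(x_prior**2) - np.mean(x_post**2))
        delta = delta0 * n ** -0.6         # decaying step size (assumed schedule)
        theta = float(np.clip(theta + delta * grad_theta, lo, hi))  # projection
        history[n - 1] = theta
    return history

# Synthetic measurements: x ~ N(0, 1/theta_true), y = x + noise.
data_rng = np.random.default_rng(1)
sigma, theta_true = 0.5, 1.0
x_true = data_rng.standard_normal(2000) / np.sqrt(theta_true)
y = x_true + sigma * data_rng.standard_normal(2000)

history = sapg_ula_toy(y, sigma)
theta_sapg = history[-500:].mean()               # average of the last iterates
theta_mle = 1.0 / (np.mean(y**2) - sigma**2)     # closed form for this toy model
print(f"SAPG estimate: {theta_sapg:.3f}, closed-form MLE: {theta_mle:.3f}")
```

The ULA discretization introduces a small bias, so the two printed values are expected to be close rather than identical. Only noisy measurements `y` enter the update; `x_true` is used solely to generate them.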

Figures (3)

  • Figure 1: Visual comparison of various reconstructions of a blurred test image with Gaussian noise. The unsupervised MMSE (h) and MAP (i) reconstructions of the proposed SAPG method contain visual artifacts compared to the gradient-step training method. The standard deviation (j) shows the uncertainty around edges. EI produces the sharpest-looking image, which is due to its explicit knowledge of the forward operator as a blur kernel and its much richer end-to-end parameterization.
  • Figure 2: Visual comparison of reconstructions for Poisson denoising. The MMSE and MAP reconstructions of the proposed unsupervised SAPG method both show significant denoising, but with artifacts present. EI has color artifacts, while the supervised GS method has more textural artifacts. DIP has a strong smoothing effect but also induces strange visual artifacts around the target, as shown in the DIP (U-Net) subfigure.
  • Figure 3: Visual comparison of reconstructions for uniform deconvolution when using models trained for Gaussian deconvolution. We observe more residual blur for EI when compared to the model-based priors in the bottom row.

Theorems & Definitions (10)

  • Theorem 1: de2020maximum
  • Theorem 2
  • Proposition 1
  • Proof sketch
  • Lemma 1
  • Proof
  • Lemma 2
  • Proof
  • Lemma 3
  • Proof