Learned Regularization for Inverse Problems: Insights from a Spectral Model
Martin Burger, Samira Kabri
TL;DR
This work develops a spectral-regularization framework for data-driven inverse problems, formalizing how learned regularizers can converge to the generalized inverse $A^{\dagger}$ in infinite-dimensional spaces. It derives the mean-squared-error optimal spectral coefficients $g^{\text{mse}}_n(\mu,\pi)=\dfrac{\sigma_n}{\sigma_n^2+\Delta_n(\mu)/\Pi_n(\pi)}$ and proves convergence of the corresponding learned regularizer $R^{\text{mse}}_{\mu}$ under a parameter rule as the noise level $\delta$ vanishes, with training noise $\Delta_n(\mu(\delta))$ and problem noise $\Delta_n(\nu^\delta)$ constrained in specific ways. It then adapts this framework to plug-and-play denoising (post-processing and proximal-map) and adversarial regularization (with and without a source condition), showing that several approaches can achieve the same optimal mse-regularizer provided the training noise is chosen appropriately; white-noise training generally yields convergence under mild operator/data assumptions. The paper gives concrete continuity and convergence conditions, compares the approaches, and supports the theory with numerical CT-style experiments, yielding practical guidance for designing stable, data-driven regularizers in high-dimensional inverse problems.
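The behavior of the coefficient formula above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes $\Delta_n$ and $\Pi_n$ can be read as per-mode noise and data variances (the summary does not define them), and the function name `mse_spectral_filter` and all numerical values are hypothetical.

```python
import numpy as np

def mse_spectral_filter(sigma, delta, pi):
    """Evaluate g_n = sigma_n / (sigma_n^2 + delta_n / pi_n) elementwise.

    sigma: singular values sigma_n of the forward operator
    delta: per-mode noise term Delta_n (assumed here to be a variance)
    pi:    per-mode data term Pi_n (assumed here to be a variance, > 0)
    """
    sigma = np.asarray(sigma, dtype=float)
    delta = np.asarray(delta, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return sigma / (sigma**2 + delta / pi)

# Hypothetical toy spectrum: decaying singular values and data variances.
n = np.arange(1, 6)
sigma = 1.0 / n
pi = 1.0 / n**2

# As the noise term shrinks, the filter approaches 1 / sigma_n,
# i.e. the spectral coefficients of the generalized inverse.
for noise_level in (1e-1, 1e-3, 0.0):
    delta = np.full(n.shape, noise_level)
    g = mse_spectral_filter(sigma, delta, pi)
    print(noise_level, g)
```

For any positive noise term the coefficients stay strictly below $1/\sigma_n$, which is the damping of small singular values one expects from a regularizer; with zero noise the sketch reduces to plain inversion.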
Abstract
In this chapter we provide a theoretically grounded investigation of state-of-the-art learning approaches for inverse problems from the point of view of spectral reconstruction operators. We give an extended definition of regularization methods and their convergence in terms of the underlying data distributions, which paves the way for future theoretical studies. Based on a simple spectral learning model previously introduced for supervised learning, we investigate some key properties of different learning paradigms for inverse problems, which can be formulated independently of specific architectures. In particular, we investigate the regularization properties, bias, and critical dependence on training data distributions. Moreover, our framework allows us to highlight and compare the specific behavior of the different paradigms in the infinite-dimensional limit.
