Table of Contents
Fetching ...

Distributed Tikhonov regularization for ill-posed inverse problems from a Bayesian perspective

Daniela Calvetti, Erkki Somersalo

TL;DR

This work tackles ill-posed linear inverse problems by marrying Tikhonov regularization with Bayesian hierarchical modeling to enable distributed, componentwise regularization. The Iterative Alternating Sequential (IAS) algorithm alternates between a Tikhonov-type LS update for transformed coefficients and a separable hyperprior update for the variances, with closed-form updates in common cases and a hybrid scheme for nonconvex settings. By identifying MAP estimation with regularization parameter choice, the authors provide a unified framework that supports matrix-free forward models and leverages Krylov-Lanczos solvers for scalability, including efficient computation of the pseudoinverse of the regularized operator. The approach yields results comparable to classical parameter-selection methods while enabling faster convergence and sparse/group-sparse reconstructions, demonstrated on numerical differentiation and 2D tomography tasks. A key practical impact is the ability to perform distributed regularization with efficient, QR-free computations using Lanczos-based techniques in large-scale or matrix-free contexts.

Abstract

We exploit the similarities between Tikhonov regularization and Bayesian hierarchical models to propose a regularization scheme that acts like a distributed Tikhonov regularization where the amount of regularization varies from component to component. In the standard formulation, Tikhonov regularization compensates for the inherent ill-conditioning of linear inverse problems by augmenting the data fidelity term measuring the mismatch between the data and the model output with a scaled penalty functional. The selection of the scaling is the core problem in Tikhonov regularization. If an estimate of the amount of noise in the data is available, a popular way is to use the Morozov discrepancy principle, stating that the scaling parameter should be chosen so as to guarantee that the norm of the data fitting error is approximately equal to the norm of the noise in the data. A too small value of the regularization parameter would yield a solution that fits to the noise while a too large value would lead to an excessive penalization of the solution. In many applications, it would be preferable to apply distributed regularization, replacing the regularization scalar by a vector valued parameter, allowing different regularization for different components of the unknown, or for groups of them. A distributed Tikhonov-inspired regularization is particularly well suited when the data have significantly different sensitivity to different components, or to promote sparsity of the solution. The numerical scheme that we propose, while exploiting the Bayesian interpretation of the inverse problem and identifying the Tikhonov regularization with the Maximum A Posteriori (MAP) estimation, requires no statistical tools. A combination of numerical linear algebra and optimization tools makes the scheme computationally efficient and suitable for problems where the matrix is not explicitly available.

Distributed Tikhonov regularization for ill-posed inverse problems from a Bayesian perspective

TL;DR

This work tackles ill-posed linear inverse problems by marrying Tikhonov regularization with Bayesian hierarchical modeling to enable distributed, componentwise regularization. The Iterative Alternating Sequential (IAS) algorithm alternates between a Tikhonov-type LS update for transformed coefficients and a separable hyperprior update for the variances, with closed-form updates in common cases and a hybrid scheme for nonconvex settings. By identifying MAP estimation with regularization parameter choice, the authors provide a unified framework that supports matrix-free forward models and leverages Krylov-Lanczos solvers for scalability, including efficient computation of the pseudoinverse of the regularized operator. The approach yields results comparable to classical parameter-selection methods while enabling faster convergence and sparse/group-sparse reconstructions, demonstrated on numerical differentiation and 2D tomography tasks. A key practical impact is the ability to perform distributed regularization with efficient, QR-free computations using Lanczos-based techniques in large-scale or matrix-free contexts.

Abstract

We exploit the similarities between Tikhonov regularization and Bayesian hierarchical models to propose a regularization scheme that acts like a distributed Tikhonov regularization where the amount of regularization varies from component to component. In the standard formulation, Tikhonov regularization compensates for the inherent ill-conditioning of linear inverse problems by augmenting the data fidelity term measuring the mismatch between the data and the model output with a scaled penalty functional. The selection of the scaling is the core problem in Tikhonov regularization. If an estimate of the amount of noise in the data is available, a popular way is to use the Morozov discrepancy principle, stating that the scaling parameter should be chosen so as to guarantee that the norm of the data fitting error is approximately equal to the norm of the noise in the data. A too small value of the regularization parameter would yield a solution that fits to the noise while a too large value would lead to an excessive penalization of the solution. In many applications, it would be preferable to apply distributed regularization, replacing the regularization scalar by a vector valued parameter, allowing different regularization for different components of the unknown, or for groups of them. A distributed Tikhonov-inspired regularization is particularly well suited when the data have significantly different sensitivity to different components, or to promote sparsity of the solution. The numerical scheme that we propose, while exploiting the Bayesian interpretation of the inverse problem and identifying the Tikhonov regularization with the Maximum A Posteriori (MAP) estimation, requires no statistical tools. A combination of numerical linear algebra and optimization tools makes the scheme computationally efficient and suitable for problems where the matrix is not explicitly available.
Paper Structure (15 sections, 3 theorems, 94 equations, 6 figures)

This paper contains 15 sections, 3 theorems, 94 equations, 6 figures.

Key Result

Theorem 4.1

The IAS algorithm with gamma hyperprior ($r=1$) converges to the unique minimizer $(\widehat{z},\widehat{\theta})$ of the Gibbs energy functional. Moreover, the minimizer $(\widehat{z},\widehat{\theta})$ satisfies the fixed point condition where $f$ is the map with $j$th component

Figures (6)

  • Figure 1: A schematic picture justifying the selection of hyperpriors from the generalized gamma family. For simplicity, assume that ${\mathsf L}={\mathsf I}$, and we require that $x$ is component-wise sparse. A component $x_j$ that is assumed to vanish or to be small is a priori thought of being drawn from a Gaussian with small variance $\theta_j$, while a large component $x_k$ is drawn from a Gaussian with large $\theta_k$. A priori, we do not know where the large values are, but for sparsity, they should be rare, which is in line with the assumption that $\Theta$ has independent components with a heavy tailed distribution.
  • Figure 2: Tikhonov-regularized solutions of the numerical differentiation using the two different methods for choosing the regularization parameter. The left panels show the noisy data, the middle panels the solutions based on Morozov discrepancy principle, and on the left, the solutions using the IAS-based criterion for the regularization parameter.
  • Figure 3: The left panel shows the logarithmic plot of the regularization parameter values using the two selection criteria as a function of relative noise level. The middle panel shows the number of times that the linear system needs to be solved for estimating the parameter, and on the right, the relative reconstruction error using the two methods are shown.
  • Figure 4: The geometric configuration of the fan beam tomography problem is shown schematically on the left, where three projection views are included. In the panel on the right, the true target is superimposed on the discretization mesh used in the inverse problem. The true densities of the inclusions are $\rho = 1.2$ in the pink kite shaped inclusion and $\rho=1$ in the blue circular inclusion. The number of interior nodes in this mesh is $n = 3\,821$, which is the number of unknowns in the inverse problem.
  • Figure 5: Six first iterations of the density reconstruction of the IAS algorithm (two first rows, lexicographical order), and the corresponding variances $\theta$ (two last rows, lexicographical order). After six iterations, the reconstructions remain visually very similar to the last reconstruction.
  • ...and 1 more figures

Theorems & Definitions (4)

  • Theorem 4.1
  • Theorem 4.2
  • Lemma 5.1
  • proof