Efficient iterative methods for hyperparameter estimation in large-scale linear inverse problems
Khalil A Hall-Hooper, Arvind K Saibaba, Julianne Chung, Scot M Miller
TL;DR
The paper addresses hyperparameter estimation in large-scale linear-Gaussian inverse problems by combining an empirical Bayes (EB) approach with generalized Golub-Kahan (genGK) bidiagonalization to form low-rank surrogates. It derives approximations to the EB objective and its gradient, $\tilde{\mathcal{F}}_k$ and $\widetilde{\nabla \mathcal{F}}_k$, that rely on a rank-$k$ reduction $\mathbf{A} \approx \mathbf{U}_{k+1} \mathbf{B}_k \mathbf{V}_k^\top$ and avoid explicit computation of square roots or inverses of the prior covariance. The work provides rigorous error bounds, a posteriori error estimators via Monte Carlo trace techniques, and practical stopping criteria for the genGK iterations. Numerical experiments on inverse heat transfer (1D), seismic tomography (2D), and atmospheric tomography demonstrate accurate hyperparameter recovery and substantial speedups over full-matrix computations, highlighting the method’s robustness when noise and prior variances are unknown. The approach is general and can inform fully Bayesian inference, variational Bayes, and information-theoretic design for hyperparameters in large-scale problems, enabling scalable uncertainty quantification in geophysical applications.
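The rank-$k$ reduction mentioned above can be illustrated with a minimal dense sketch of the genGK recurrences. This is not the authors' implementation. The names `gen_gk`, `R_inv` (the inverse noise covariance, assumed cheap to apply), `Q` (the prior covariance), and the starting vector `b` are assumptions introduced for illustration, and the recurrences follow the standard form of generalized Golub-Kahan bidiagonalization. Note that `Q` enters only through matrix-vector products, never through a square root or an inverse.

```python
import numpy as np


def gen_gk(A, Q, R_inv, b, k):
    """Run k steps of generalized Golub-Kahan bidiagonalization (dense sketch).

    Returns U (m x (k+1)), B ((k+1) x k, lower bidiagonal), V (n x k) and
    beta1 such that, in exact arithmetic,
        A @ Q @ V = U @ B,   U.T @ R_inv @ U = I,   V.T @ Q @ V = I,
        beta1 * U[:, 0] = b.
    """
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k + 1))  # stores v_1, ..., v_{k+1}; the last is dropped
    B = np.zeros((k + 1, k))

    # Initialization: u_1 = b / ||b||_{R^{-1}}, v_1 = A^T R^{-1} u_1 / alpha_1.
    beta1 = np.sqrt(b @ (R_inv @ b))
    U[:, 0] = b / beta1
    w = A.T @ (R_inv @ U[:, 0])
    alpha = np.sqrt(w @ (Q @ w))
    V[:, 0] = w / alpha

    for i in range(k):
        # beta_{i+1} u_{i+1} = A Q v_i - alpha_i u_i
        u = A @ (Q @ V[:, i]) - alpha * U[:, i]
        # Optional safeguard: reorthogonalize in the R^{-1} inner product.
        u -= U[:, : i + 1] @ (U[:, : i + 1].T @ (R_inv @ u))
        beta = np.sqrt(u @ (R_inv @ u))
        U[:, i + 1] = u / beta

        B[i, i] = alpha      # diagonal entry alpha_i
        B[i + 1, i] = beta   # subdiagonal entry beta_{i+1}

        # alpha_{i+1} v_{i+1} = A^T R^{-1} u_{i+1} - beta_{i+1} v_i
        v = A.T @ (R_inv @ U[:, i + 1]) - beta * V[:, i]
        # Optional safeguard: reorthogonalize in the Q inner product.
        v -= V[:, : i + 1] @ (V[:, : i + 1].T @ (Q @ v))
        alpha = np.sqrt(v @ (Q @ v))
        V[:, i + 1] = v / alpha

    return U, B, V[:, :k], beta1
```

In a large-scale setting the dense products `A @ x`, `A.T @ y`, and `Q @ x` would be replaced by matrix-free operator applications; the dense form is used here only to keep the sketch short and checkable.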
Abstract
We study Bayesian methods for large-scale linear inverse problems, focusing on the challenging task of hyperparameter estimation. Typical hierarchical Bayesian formulations that follow a Markov chain Monte Carlo approach are possible for small problems with very few hyperparameters but are not computationally feasible for problems with a very large number of unknown parameters. In this work, we describe an empirical Bayesian (EB) method to estimate hyperparameters that maximize the marginal posterior, i.e., the probability density of the hyperparameters conditioned on the data, and then we use the estimated values to compute the posterior of the inverse parameters. For problems where computing the square root and inverse of the prior covariance matrix is not feasible, we describe an approach based on the generalized Golub-Kahan bidiagonalization to approximate the marginal posterior and seek hyperparameters that maximize the approximate marginal posterior. Numerical results from seismic and atmospheric tomography demonstrate the accuracy, robustness, and potential benefits of the proposed approach.
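For orientation, the marginal posterior referred to above can be written out under the standard linear-Gaussian model. The notation here is an assumption introduced for illustration and may differ from the paper's: data $\mathbf{d} = \mathbf{A}\mathbf{s} + \boldsymbol{\epsilon}$, prior $\mathbf{s} \mid \boldsymbol{\theta} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{Q}(\boldsymbol{\theta}))$, noise $\boldsymbol{\epsilon} \mid \boldsymbol{\theta} \sim \mathcal{N}(\mathbf{0}, \mathbf{R}(\boldsymbol{\theta}))$, and hyperprior $\pi(\boldsymbol{\theta})$. Marginalizing over $\mathbf{s}$ gives

$$
\pi(\boldsymbol{\theta} \mid \mathbf{d}) \propto \pi(\boldsymbol{\theta}) \, \det\bigl(\boldsymbol{\Psi}(\boldsymbol{\theta})\bigr)^{-1/2} \exp\Bigl(-\tfrac{1}{2} \|\mathbf{A}\boldsymbol{\mu} - \mathbf{d}\|^2_{\boldsymbol{\Psi}(\boldsymbol{\theta})^{-1}}\Bigr), \qquad \boldsymbol{\Psi}(\boldsymbol{\theta}) = \mathbf{A}\mathbf{Q}(\boldsymbol{\theta})\mathbf{A}^\top + \mathbf{R}(\boldsymbol{\theta}),
$$

so that maximizing the marginal posterior is equivalent to minimizing its negative logarithm,

$$
\mathcal{F}(\boldsymbol{\theta}) = -\log \pi(\boldsymbol{\theta}) + \tfrac{1}{2} \log\det \boldsymbol{\Psi}(\boldsymbol{\theta}) + \tfrac{1}{2} \|\mathbf{A}\boldsymbol{\mu} - \mathbf{d}\|^2_{\boldsymbol{\Psi}(\boldsymbol{\theta})^{-1}},
$$

up to an additive constant. The log-determinant and the weighted residual norm both involve the $m \times m$ matrix $\boldsymbol{\Psi}(\boldsymbol{\theta})$, which is what makes direct evaluation expensive for large problems.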
