On Optimal Regularization Parameters via Bilevel Learning
Matthias J. Ehrhardt, Silvia Gazzola, Sebastian J. Scott
TL;DR
This work investigates bilevel learning for selecting optimal regularization parameters in variational regularization of linear inverse problems. It derives a new sufficient condition, based on Bregman distances, that guarantees the optimal parameter $\alpha$ is strictly positive when the forward operator $A$ is injective and the regularizer is convex, bounded below, and continuously differentiable; the condition is expressed in terms of $B=(A^TA)^{-1}$ and the least-squares solution $x^0$. The authors extend the analysis to an expected predictive-risk upper level with invertible $A$ and provide pointwise (denoising) corollaries, including a guaranteed positivity result under zero-mean noise when the regularizer is strictly convex. Numerical experiments in low- and high-dimensional settings validate the theory and show that the new condition characterizes positivity more sharply than existing criteria, with practical implications for regularizer design and parameter tuning in imaging tasks. Overall, the results deepen the theoretical understanding of bilevel learning as a regularization-parameter selection strategy and illustrate it on denoising and deconvolution applications.
Abstract
Variational regularization is commonly used to solve linear inverse problems, and involves augmenting a data fidelity term with a regularizer. The regularizer promotes a priori information and is weighted by a regularization parameter. Selecting an appropriate regularization parameter is critical, as different choices can lead to very different reconstructions. Classical strategies for determining a suitable parameter value include the discrepancy principle and the L-curve criterion, and in recent years a supervised machine learning approach called bilevel learning has been employed. Bilevel learning is a powerful framework for determining optimal parameters and involves solving a nested optimization problem. While the classical strategies enjoy various theoretical results, the well-posedness of bilevel learning in this setting is still an open question. In particular, a necessary property is positivity of the determined regularization parameter. In this work, we provide a new condition that better characterizes positivity of optimal regularization parameters than the existing theory. Numerical results verify and explore this new condition for both low- and high-dimensional problems.
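To make the nested structure concrete, the following is a minimal sketch of bilevel learning on a toy denoising problem. It is not the authors' setup: it assumes $A = I$, the quadratic regularizer $\frac{1}{2}\|x\|^2$, a single training pair, and a squared-error upper level, all chosen only because the lower-level solution $x(\alpha) = y/(1+\alpha)$ and the optimal $\alpha$ are then available in closed form. The function names and the data are hypothetical. For this toy case, direct calculus shows the optimal parameter is strictly positive exactly when $0 < \langle y, x^\ast \rangle < \|y\|^2$, where $x^\ast$ is the ground truth.

```python
def lower_level(y, alpha):
    """Lower level: argmin_x 0.5*||x - y||^2 + 0.5*alpha*||x||^2.
    With A = I and a quadratic regularizer (toy assumptions), the
    minimizer is x = y / (1 + alpha)."""
    return [yi / (1.0 + alpha) for yi in y]


def upper_level(alpha, y, x_true):
    """Upper level: squared error between the reconstruction and the
    ground truth for one training pair (y, x_true)."""
    x = lower_level(y, alpha)
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, x_true))


def optimal_alpha(y, x_true):
    """Closed-form minimizer of the upper level over alpha >= 0,
    valid only for this toy problem."""
    yy = sum(yi * yi for yi in y)
    yx = sum(yi * ti for yi, ti in zip(y, x_true))
    if yx <= 0.0:
        # The best reconstruction is x = 0, approached as alpha -> infinity.
        return float("inf")
    # Unconstrained optimum is alpha = ||y||^2 / <y, x_true> - 1; clip at 0.
    return max(0.0, yy / yx - 1.0)


# Hypothetical data: ground truth plus a fixed perturbation.
x_true = [1.0, 2.0, 3.0]
noise = [0.5, -0.5, 0.5]
y = [t + e for t, e in zip(x_true, noise)]

alpha_star = optimal_alpha(y, x_true)
print("optimal alpha:", alpha_star)
print("upper-level loss at alpha = 0:", upper_level(0.0, y, x_true))
print("upper-level loss at optimal alpha:", upper_level(alpha_star, y, x_true))
```

With noise-free data ($y = x^\ast$) the same formula returns $\alpha = 0$, which illustrates why positivity of the learned parameter is not automatic and needs a condition on the data.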
