General Uncertainty Estimation with Delta Variances
Simon Schmitt, John Shawe-Taylor, Hado van Hasselt
TL;DR
The paper addresses the uncertainty that limited data induces in large neural networks. It proposes Delta Variances, a gradient-based framework for epistemic uncertainty that requires no changes to the network architecture. Delta Variances approximate the posterior or leave-one-out variance of a quantity of interest (QoI) with a simple quadratic form Delta_u(z)^T Sigma Delta_u(z), where Delta_u(z) is the gradient of the QoI with respect to the parameters and Sigma is a covariance surrogate (often (1/N)F^{-1}, with F the Fisher information matrix and N the number of training examples). The authors connect Bayesian, frequentist, adversarial, and out-of-distribution perspectives, provide theoretical motivations, and show that special cases recover known methods such as the Delta Method and the Laplace approximation. Empirically, they validate Delta Variances on the GraphCast weather forecasting system, where the method achieves competitive uncertainty estimates at substantially lower inference cost than ensembling, and they demonstrate extensions to implicit QoIs and a learned Sigma. The work offers a practical, scalable approach to quantifying epistemic uncertainty in complex predictive systems, and it adapts to a range of QoIs and iterative algorithms.
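The quadratic form above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the two-parameter `model`, the values in `theta_hat`, the helper names, the use of finite differences in place of automatic differentiation, and the Gaussian-noise Fisher matrix used to build Sigma = (1/N)F^{-1}. The last lines treat a three-step rollout of the model as the QoI, loosely mirroring the idea of a step function applied repeatedly inside a simulator.

```python
import math

def fd_grad(fn, theta, eps=1e-6):
    """Central finite-difference gradient of a scalar function fn at theta."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((fn(up) - fn(down)) / (2.0 * eps))
    return grad

def fisher_2x2(model, theta, xs, noise_var=1.0):
    """Fisher matrix of a Gaussian-noise regression model, averaged over xs.
    Assumes exactly two parameters so the inverse has a closed form."""
    f = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        g = fd_grad(lambda t: model(t, x), theta)
        for i in range(2):
            for j in range(2):
                f[i][j] += g[i] * g[j] / (noise_var * len(xs))
    return f

def sigma_from_fisher(f, n, damping=0.0):
    """Sigma = (1/N) (F + damping * I)^{-1}, closed form for a 2x2 matrix."""
    a, b = f[0][0] + damping, f[0][1]
    c, d = f[1][0], f[1][1] + damping
    det = a * d - b * c
    return [[d / (n * det), -b / (n * det)],
            [-c / (n * det), a / (n * det)]]

def delta_variance(qoi, theta, sigma):
    """Quadratic form Delta^T Sigma Delta, with Delta the gradient of the QoI."""
    delta = fd_grad(qoi, theta)
    k = len(delta)
    return sum(delta[i] * sigma[i][j] * delta[j]
               for i in range(k) for j in range(k))

def model(theta, x):
    # Hypothetical two-parameter model: a * tanh(b * x)
    return theta[0] * math.tanh(theta[1] * x)

def rollout(theta, x0, steps=3):
    # Hypothetical composite QoI: apply the model repeatedly as a step function
    x = x0
    for _ in range(steps):
        x = model(theta, x)
    return x

theta_hat = [1.5, 0.8]                   # stand-in for trained parameters
train_xs = [-2.0, -1.0, 0.0, 1.0, 2.0]   # stand-in for training inputs
sigma = sigma_from_fisher(fisher_2x2(model, theta_hat, train_xs),
                          n=len(train_xs))

# Epistemic variance of a single prediction and of a three-step rollout
var_point = delta_variance(lambda t: model(t, 0.5), theta_hat, sigma)
var_rollout = delta_variance(lambda t: rollout(t, 0.5), theta_hat, sigma)
print(var_point, var_rollout)
```

Once Sigma is available, each new QoI costs one gradient evaluation plus a quadratic form, which is what makes the approach cheap relative to an ensemble. For a model of realistic size, Sigma would not be formed and inverted explicitly as it is here; this sketch only works because there are two parameters.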
Abstract
Decision makers may suffer from uncertainty induced by limited data. This may be mitigated by accounting for epistemic uncertainty, which is, however, challenging to estimate efficiently for large neural networks. To this end, we investigate Delta Variances, a family of algorithms for epistemic uncertainty quantification that is computationally efficient and convenient to implement. It can be applied to neural networks and to more general functions composed of neural networks. As an example, we consider a weather simulator with a neural-network-based step function inside; here, Delta Variances empirically obtain competitive results at the cost of a single gradient computation. The approach is convenient because it requires no changes to the neural network architecture or training procedure. We discuss multiple ways to derive Delta Variances theoretically, noting that special cases recover popular techniques, and present a unified perspective on multiple related methods. Finally, we observe that this general perspective gives rise to a natural extension and empirically show its benefit.
