Uncertainty Quantification via Stable Distribution Propagation

Felix Petersen, Aashwin Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin

TL;DR

This work tackles uncertainty quantification in neural networks by proposing Stable Distribution Propagation (SDP), a sampling-free method that propagates Gaussian and Cauchy input uncertainties through networks. SDP uses exact affine propagation and a total-variation-optimal local linearization for nonlinearities (notably ReLU) to deliver tractable, non-marginal distribution updates; it also extends to joint input/output uncertainty via Probabilistic Neural Networks. The authors demonstrate SDP's advantages over marginal moment matching and DVIA in both accuracy (TV/Wasserstein metrics) and computation, and show its utility for calibrated prediction intervals and out-of-distribution selective prediction, including applications on UCI datasets and MNIST/EMNIST with large networks. The method is compatible with pre-trained models and scalable to deep architectures, offering a practical tool for reliable uncertainty quantification in safety-critical settings.

Abstract

We propose a new approach for propagating stable probability distributions through neural networks. Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity. This allows propagating Gaussian and Cauchy input uncertainties through neural networks to quantify their output uncertainties. To demonstrate the utility of propagating distributions, we apply the proposed method to predicting calibrated confidence intervals and selective prediction on out-of-distribution data. The results demonstrate a broad applicability of propagating distributions and show the advantages of our method over other approaches such as moment matching.
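To make the propagation idea concrete, here is a minimal sketch for the Gaussian case, assuming a full covariance matrix is tracked through the network. Affine layers are propagated exactly, and each ReLU is replaced by its linearization at the current mean. The helper names (`affine`, `relu_linearized`) and the convention of using slope 0 at a mean of exactly 0 are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


def affine(mu: Vector, cov: Matrix, W: Matrix, b: Vector) -> Tuple[Vector, Matrix]:
    """Exact propagation through y = W x + b: mean W mu + b, covariance W cov W^T."""
    new_mu = [sum(w * m for w, m in zip(row, mu)) + bi for row, bi in zip(W, b)]
    new_cov = matmul(matmul(W, cov), transpose(W))
    return new_mu, new_cov


def relu_linearized(mu: Vector, cov: Matrix) -> Tuple[Vector, Matrix]:
    """Linearize ReLU at the mean: Jacobian J = diag(1[mu_i > 0]), covariance J cov J^T.

    Assumption: slope 0 is used when mu_i == 0, where ReLU is not differentiable.
    """
    gate = [1.0 if m > 0 else 0.0 for m in mu]
    new_mu = [max(m, 0.0) for m in mu]
    n = len(mu)
    new_cov = [[gate[i] * cov[i][j] * gate[j] for j in range(n)] for i in range(n)]
    return new_mu, new_cov


# Hypothetical two-layer network and input uncertainty (std 0.2 per input).
mu0, cov0 = [1.0, -1.0], [[0.04, 0.0], [0.0, 0.04]]
W1, b1 = [[1.0, -1.0], [2.0, 1.0]], [0.0, -3.0]
W2, b2 = [[1.0, 1.0]], [0.0]

mu1, cov1 = affine(mu0, cov0, W1, b1)
mu2, cov2 = relu_linearized(mu1, cov1)
mu_out, cov_out = affine(mu2, cov2, W2, b2)
print(mu_out, cov_out)
```

Only one forward pass is needed and no samples are drawn, which is what makes this kind of propagation usable with an already trained network. The Cauchy case is not covered by this sketch.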

Paper Structure (21 sections, 5 theorems, 15 equations, 4 figures, 22 tables)

Key Result

Theorem 1

Local linearization provides the optimal Gaussian approximation of a univariate Gaussian distribution transformed by a ReLU non-linearity with respect to the total variation distance.
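As an illustration of what local linearization means for a single ReLU, the sketch below compares it with marginal moment matching for a univariate Gaussian input N(mu, sigma^2). The function names are hypothetical, the slope at mu = 0 is taken to be 0 by assumption, and the moment matching formulas are the standard closed-form mean and variance of a rectified Gaussian. The sketch shows the two approximations only; it does not reproduce the theorem's formal statement or proof.

```python
import math
from typing import Tuple


def relu_local_linearization(mu: float, sigma: float) -> Tuple[float, float]:
    """Approximate ReLU(X) by ReLU(mu) + ReLU'(mu) * (X - mu).

    Returns the mean and standard deviation of the resulting Gaussian.
    Assumption: ReLU'(0) is taken as 0.
    """
    slope = 1.0 if mu > 0 else 0.0
    return max(mu, 0.0), slope * sigma


def relu_moment_matching(mu: float, sigma: float) -> Tuple[float, float]:
    """Gaussian with the exact mean and variance of ReLU(X), X ~ N(mu, sigma^2)."""
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    mean = mu * cdf + sigma * pdf
    second_moment = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    variance = max(second_moment - mean * mean, 0.0)  # guard against round-off
    return mean, math.sqrt(variance)


for mu in (-1.0, 0.5, 2.0):
    print(mu, relu_local_linearization(mu, 1.0), relu_moment_matching(mu, 1.0))
```

For a clearly positive mean, the linearization leaves the input distribution unchanged, while moment matching shifts the mean and shrinks the variance to account for the clipped mass. For a negative mean, the linearization collapses to a point mass at 0.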

Figures (4)

  • Figure 1: Data (left) and model (right) uncertainty estimation. Yellow / blue indicate high / low uncertainty.
  • Figure 2: Visualization of two sample parametric approximations of ReLU for moment matching ((a) and (b), orange, dashed) as well as for the proposed Stable Distribution Propagation ((c) and (d), red, dashed). The gray (dotted) distribution shows the input and the blue (solid) is the true output distribution (best viewed in color).
  • Figure 3: Plot of the logarithm of the $W_1$ distance between the true distribution ReLU($X$) (i.e., the distribution after a single ReLU) and SDP (blue) as well as marginal moment matching (orange).
  • Figure 4: Selective prediction on MNIST with EMNIST letters as out-of-distribution (OOD) data, evaluated on (a) pretrained off-the-shelf models and (b) models trained with uncertainty propagation. Left: risk-coverage plots for off-the-shelf models trained with softmax cross-entropy. Right: risk-coverage plots for models trained with uncertainty propagation. The grey line indicates perfect prediction. Results are averaged over $10$ runs.
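
The risk-coverage plots mentioned for Figure 4 can be computed along the following lines. This is a generic sketch of a risk-coverage curve, assuming each test sample comes with a scalar uncertainty score and a flag saying whether the prediction was correct. The function name and the toy data are hypothetical, and the paper's actual uncertainty score is not specified here.

```python
from typing import List, Sequence, Tuple


def risk_coverage_curve(uncertainty: Sequence[float],
                        is_correct: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (coverage, risk) pairs.

    Samples are accepted in order of increasing uncertainty. At each step,
    coverage is the fraction of samples accepted so far and risk is the
    error rate among the accepted samples.
    """
    order = sorted(range(len(uncertainty)), key=lambda i: uncertainty[i])
    curve = []
    errors = 0
    for accepted, idx in enumerate(order, start=1):
        if not is_correct[idx]:
            errors += 1
        curve.append((accepted / len(order), errors / accepted))
    return curve


# Toy data: the one wrong prediction also has the highest uncertainty,
# so the risk stays at 0 until full coverage is reached.
scores = [0.1, 0.9, 0.4, 0.7]
correct = [True, False, True, True]
for coverage, risk in risk_coverage_curve(scores, correct):
    print(f"coverage={coverage:.2f} risk={risk:.2f}")
```

A useful uncertainty score ranks misclassified and OOD samples last, which keeps the risk low over a wide range of coverage values.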

Theorems & Definitions (6)

  • Theorem 1
  • Corollary 2
  • Theorem 1
  • proof
  • Corollary 2
  • Corollary 3