A Framework for Variational Inference of Lightweight Bayesian Neural Networks with Heteroscedastic Uncertainties

David J. Schodt, Ryan Brown, Michael Merritt, Samuel Park, Delsin Menolascino, Mark A. Peot

TL;DR

The paper addresses uncertainty quantification in lightweight Bayesian neural networks (BNNs) by embedding the total predictive uncertainty into the variances of the BNN parameters rather than adding a separate aleatoric-variance output head. It develops a sampling-free variational inference framework built on moment propagation, in which the per-point predictive variance decomposes as $\sigma_k^2 = \sigma_{a,k}^2 + \sigma_{e,k}^2$ (aleatoric plus epistemic), and layer-wise mean/variance updates are derived for fully connected, convolutional, pooling, and leaky-ReLU layers. The approach captures both epistemic and aleatoric uncertainty without increasing model size. It is demonstrated on a heteroscedastic polynomial regression task, where the embedded-variance approach improves out-of-distribution reliability and can outperform the learned-variance approach in lightweight regimes. The results point to a practical path for deploying uncertainty-aware BNNs on resource-constrained devices, combining accuracy with efficient, sampling-free inference.

Abstract

Obtaining heteroscedastic predictive uncertainties from a Bayesian Neural Network (BNN) is vital to many applications. Often, heteroscedastic aleatoric uncertainties are learned as outputs of the BNN in addition to the predictive means; however, doing so may necessitate adding more learnable parameters to the network. In this work, we demonstrate that both the heteroscedastic aleatoric and epistemic variances can be embedded into the variances of learned BNN parameters, improving predictive performance for lightweight networks. By complementing this approach with moment propagation for inference, we introduce a relatively simple framework for sampling-free variational inference suitable for lightweight BNNs.
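To make the idea of sampling-free moment propagation concrete, the following is a minimal sketch of how a mean and a variance could be pushed through a single fully connected layer in closed form. It is not the paper's derivation: it assumes that every weight, bias, and input component is an independent random variable (a mean-field assumption), and the function name `fc_moments` and its argument names are hypothetical.

```python
from typing import List, Tuple


def fc_moments(
    x_mean: List[float],
    x_var: List[float],
    w_mean: List[List[float]],
    w_var: List[List[float]],
    b_mean: List[float],
    b_var: List[float],
) -> Tuple[List[float], List[float]]:
    """Propagate means and variances through y = W x + b without sampling.

    Assumption (for this sketch only): all weights, biases, and input
    components are mutually independent.
    """
    y_mean: List[float] = []
    y_var: List[float] = []
    for i in range(len(b_mean)):
        m = b_mean[i]
        v = b_var[i]
        for j in range(len(x_mean)):
            # E[w * x] = E[w] * E[x] for independent w and x.
            m += w_mean[i][j] * x_mean[j]
            # Var[w * x] = E[w]^2 Var[x] + Var[w] (E[x]^2 + Var[x]).
            v += (w_mean[i][j] ** 2) * x_var[j]
            v += w_var[i][j] * (x_mean[j] ** 2 + x_var[j])
        y_mean.append(m)
        y_var.append(v)
    return y_mean, y_var


# A deterministic input (zero variance) still yields a nonzero output
# variance, because the weight and bias variances carry the uncertainty.
mean, var = fc_moments(
    x_mean=[2.0], x_var=[0.0],
    w_mean=[[3.0]], w_var=[[0.25]],
    b_mean=[1.0], b_var=[0.5],
)
```

In this sketch the output variance comes entirely from the parameter variances, which is the sense in which predictive uncertainty can live in the parameters rather than in an extra output. Nonlinear layers such as leaky-ReLU need their own moment formulas and are not covered here.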

Paper Structure (21 sections, 15 equations, 2 figures)


Figures (2)

  • Figure 1: BNN architecture comparison and performance. (a) Architecture diagrams showing the BNN used with the "embedded variance" approach (left) versus the "learned variance" approach (right) for recovering heteroscedastic variance. Each BNN contains a single hidden layer with leaky-ReLU activation functions. Notably, for an equal number of neurons in the hidden layer, the "learned variance" BNN has a greater number of learnable parameters. (b) Reconstruction loss (expected negative log-likelihood) versus the number of neurons in the hidden layer of each BNN. Results for the BNN trained with the "embedded variance" approach (shown in (a) on the left) are indicated by blue filled circles. Results for the BNN trained with the "learned variance" approach (shown in (a) on the right) are indicated by orange filled squares.
  • Figure 2: Heteroscedastic variance recovered by BNN. BNN predictions with uncertainties when predicting the data $y = x + \epsilon(x) + 1$ with heteroscedastic noise $\epsilon(x)$. (a) Predictions from the BNNs trained with the "embedded variance" approach (this work), where the combined aleatoric and epistemic variances are embedded into the variances of the trainable parameters of the BNN. (b) Predictions from the BNNs trained with the "learned variance" approach, where the aleatoric variance is predicted as an output of the BNN. (c) Reconstruction losses on data sampled from the support of the training distribution (in the range $[-0.5, 0.5]$) and data sampled from outside of this support. The dashed black lines indicate the boundaries of the training distribution range $[-0.5, 0.5]$. The reconstruction losses indicated by blue circles correspond to the models shown in (a), while the losses indicated by orange squares correspond to the models shown in (b).
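The two quantities compared in these figures, parameter count and reconstruction loss, can be illustrated with a small sketch. It rests on assumptions not stated in the captions: that every weight and bias stores one mean and one variance, that the "learned variance" network differs only by a second output neuron for the variance, and that the likelihood is Gaussian. The helper names `count_parameters` and `gaussian_nll` are hypothetical.

```python
import math


def count_parameters(n_in: int, n_hidden: int, n_out: int) -> int:
    """Learnable scalars in a one-hidden-layer BNN.

    Assumption: each weight and bias has a learnable mean and a learnable
    variance, hence the factor of 2.
    """
    n_weights = n_in * n_hidden + n_hidden * n_out
    n_biases = n_hidden + n_out
    return 2 * (n_weights + n_biases)


def gaussian_nll(y: float, mean: float, var: float) -> float:
    """Negative log-likelihood of y under a Gaussian N(mean, var)."""
    return 0.5 * math.log(2.0 * math.pi * var) + (y - mean) ** 2 / (2.0 * var)


hidden = 8
# Embedded variance: a single output neuron for the predictive mean.
embedded = count_parameters(n_in=1, n_hidden=hidden, n_out=1)
# Learned variance: assumed second output neuron for the aleatoric variance.
learned = count_parameters(n_in=1, n_hidden=hidden, n_out=2)

# An over-confident variance is penalized heavily when the prediction is off,
# which is why the loss is informative outside the training range.
overconfident = gaussian_nll(y=1.0, mean=0.0, var=0.01)
calibrated = gaussian_nll(y=1.0, mean=0.0, var=1.0)
```

Under these assumptions, with 8 hidden neurons the embedded-variance network has 50 learnable scalars and the learned-variance network has 68. The gap grows linearly with the hidden-layer width.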