Can Bayesian Neural Networks Explicitly Model Input Uncertainty?

Matias Valdenegro-Toro, Marco Zullich

TL;DR

The paper investigates whether Bayesian neural networks can explicitly model input uncertainty when the input's standard deviation $x_\sigma$ is provided as an additional input alongside the data mean $x_\mu$. It introduces two-input NNs and evaluates several approximate BNN methods (Ensembles, MC-Dropout, MC-DropConnect, Flipout, and DUQ) on Two Moons, Fashion-MNIST, and a toy regression task. The findings indicate that only some methods, notably Ensembles and Flipout, propagate input uncertainty into predictive uncertainty; the others largely remain highly confident under noisy inputs, which raises calibration concerns. The work highlights that reliable input-uncertainty modeling depends on the method and suggests ensembles as the most dependable option among those tested, while calling for broader evaluations and uncertainty-disentanglement approaches in future work.

Abstract

Inputs to machine learning models can have associated noise or uncertainties, but they are often ignored and not modelled. It is unknown if Bayesian Neural Networks and their approximations are able to consider uncertainty in their inputs. In this paper we build a two-input Bayesian Neural Network (mean and standard deviation) and evaluate its capabilities for input uncertainty estimation across different methods like Ensembles, MC-Dropout, and Flipout. Our results indicate that only some uncertainty estimation methods for approximate Bayesian NNs can model input uncertainty, in particular Ensembles and Flipout.
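
The two-input setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the helper name `make_two_input`, the choice to use the noisy observation as the mean input, and the constant standard-deviation array are all assumptions made for illustration.

```python
import numpy as np

def make_two_input(x_clean, sigma, rng):
    """Return a (x_mu, x_sigma) pair for a batch of clean inputs.

    Assumption (not stated in the source): the mean input is the clean
    sample perturbed by zero-mean Gaussian noise with standard deviation
    `sigma`, and the std input is a constant array of the same shape.
    """
    x_clean = np.asarray(x_clean, dtype=np.float64)
    noise = rng.normal(loc=0.0, scale=sigma, size=x_clean.shape)
    x_mu = x_clean + noise                  # noisy observation fed as the mean
    x_sigma = np.full_like(x_clean, sigma)  # per-feature noise level
    return x_mu, x_sigma

rng = np.random.default_rng(0)
# Hypothetical stand-in for 1000 two-dimensional Two Moons points.
x_clean = rng.uniform(-1.0, 1.0, size=(1000, 2))
x_mu, x_sigma = make_two_input(x_clean, sigma=0.2, rng=rng)
```

With `sigma=0.0` the helper returns the clean data unchanged as `x_mu`, which corresponds to the unperturbed case.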

Paper Structure (12 sections, 6 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Samples from the Fashion-MNIST dataset with added Gaussian noise of increasing standard deviation ($\sigma$ in the figure). The first row ($\sigma=0.0$) represents the original, unperturbed data. Natural data are often captured by digital sensors, which are prone to noise and can sporadically fail. Training NNs that can effectively model input uncertainty, especially when the noise is anomalously high, is important for reliable predictions, which can then be discarded whenever the model's predictive uncertainty is too high.
  • Figure 2: Comparison on the Two Moons dataset with training $\sigma = 0.2$, as the testing standard deviation is varied. Each heatmap shows predictive entropy (from low, in blue, to high, in yellow), and the first column includes the training data points. With larger test standard deviation, some UQ methods (DropConnect, Dropout, DUQ) do not significantly change their output uncertainty, while Flipout and Ensembles do change significantly, indicating that they are able to model input uncertainty and propagate it to the output.
  • Figure 3: The version of the Two Moons dataset (with 1000 data points) used in the present work, with the two colors representing the two categories. From left to right, zero-mean Gaussian noise of increasing magnitude is added. The standard deviation is denoted by $\sigma$.
  • Figure 4: Diagram of the MLP for the Two Moons dataset. The mean and std inputs pass through two parallel fully-connected ("FC") layers of 10 units each, whose outputs are concatenated. Then, two 20-unit fully-connected layers and the final classification layer are applied, producing the final output. The two 20-unit layers (depicted with bold borders) are made Bayesian, depending on the specific technique used.
  • Figure 5: Diagram depicting the two-input Preact-ResNet18 used on Fashion-MNIST. The input mean and standard deviation are passed through two $7\times 7$ convolutions with 32 channels and stride 2, whose outputs are concatenated. The data is then passed sequentially through a series of residual blocks ("Preact res. block") with an increasing number of channels. Some blocks downsample the spatial dimensions. A detailed depiction of the residual blocks is shown in a separate figure (label fig:resblock). Following the last residual block, global average pooling ("GAP") is applied to return a vector of size 512. This vector is passed through a fully-connected layer which produces the final output of 10 units. The last residual block (depicted with thick borders) can be rendered Bayesian by turning its convolutional layers into the corresponding Bayesian version, depending on the method used.
  • ...and 6 more figures
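
The two-input MLP in the Figure 4 caption and the predictive entropy shown in Figure 2 can be sketched together in NumPy. This is a rough illustration under stated assumptions, not the authors' implementation: the ReLU activations, the random untrained weights, the ensemble size of 5, and all function names are hypothetical, and a small ensemble of independently initialized networks stands in for the "Ensembles" method.

```python
import numpy as np

def init_member(rng, in_dim=2, n_classes=2):
    """Random weights for one two-input MLP (untrained, illustration only)."""
    def layer(n_in, n_out):
        w = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        return w, np.zeros(n_out)
    return {
        "mu": layer(in_dim, 10),      # branch for the mean input
        "sigma": layer(in_dim, 10),   # branch for the std input
        "h1": layer(20, 20),          # first 20-unit layer after concatenation
        "h2": layer(20, 20),          # second 20-unit layer
        "out": layer(20, n_classes),  # classification layer
    }

def relu(z):
    return np.maximum(z, 0.0)  # assumed activation; not given in the caption

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(params, x_mu, x_sigma):
    """Two parallel 10-unit branches, concatenation, two 20-unit layers, softmax."""
    def dense(name, x):
        w, b = params[name]
        return x @ w + b
    h = np.concatenate(
        [relu(dense("mu", x_mu)), relu(dense("sigma", x_sigma))], axis=-1
    )
    h = relu(dense("h1", h))
    h = relu(dense("h2", h))
    return softmax(dense("out", h))

def predictive_entropy(members, x_mu, x_sigma):
    """Entropy of the ensemble-averaged class probabilities, one value per sample."""
    probs = np.mean([forward(p, x_mu, x_sigma) for p in members], axis=0)
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

# Hypothetical usage: 5 untrained members, 4 query points, test sigma = 0.2.
members = [init_member(np.random.default_rng(seed)) for seed in range(5)]
x_mu = np.array([[0.0, 0.0], [1.0, -0.5], [-1.0, 0.5], [2.0, 0.0]])
x_sigma = np.full_like(x_mu, 0.2)
probs = forward(members[0], x_mu, x_sigma)
entropy = predictive_entropy(members, x_mu, x_sigma)
```

For two classes the entropy lies between 0 and $\ln 2$. In a trained model that propagates input uncertainty, one would expect it to grow as the values in `x_sigma` increase, which is the behavior the Figure 2 caption attributes to Ensembles and Flipout.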