Variational Neural Networks

Illia Oleksiienko; Dat Thanh Tran; Alexandros Iosifidis

TL;DR

The paper proposes a method for uncertainty estimation in neural networks that, instead of considering a distribution over weights, samples the outputs of each layer from a corresponding Gaussian distribution parametrized by the predictions of mean and variance sub-layers.

Abstract

Bayesian Neural Networks (BNNs) provide a tool to estimate the uncertainty of a neural network by considering a distribution over weights and sampling different models for each input. In this paper, we propose a method for uncertainty estimation in neural networks which, instead of considering a distribution over weights, samples the outputs of each layer from a corresponding Gaussian distribution parametrized by the predictions of mean and variance sub-layers. In uncertainty quality estimation experiments, we show that the proposed method achieves better uncertainty quality than other single-bin Bayesian Model Averaging methods, such as Monte Carlo Dropout and Bayes By Backpropagation.
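The layer described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the class name `VariationalLayer`, the helper `predict`, the random initialization, and the use of softplus to keep the standard deviation positive are all assumptions made for illustration. The sketch only shows the general idea of two sub-layers producing a mean and a spread, followed by sampling the layer output, and of repeating stochastic passes to obtain a spread over predictions.

```python
import math
import random


def linear(weights, bias, x):
    """Plain dense layer: W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softplus(z):
    """Maps a real number to a positive one (assumed choice, not from the paper)."""
    return math.log1p(math.exp(z))


def random_matrix(rng, n_out, n_in):
    """Small random weight matrix, used here only as a placeholder for trained weights."""
    return [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]


class VariationalLayer:
    """Hypothetical layer: mean and variance sub-layers parametrize a Gaussian."""

    def __init__(self, n_in, n_out, rng):
        self.rng = rng
        self.w_mu, self.b_mu = random_matrix(rng, n_out, n_in), [0.0] * n_out
        self.w_sigma, self.b_sigma = random_matrix(rng, n_out, n_in), [0.0] * n_out

    def forward(self, x):
        mu = linear(self.w_mu, self.b_mu, x)
        sigma = [softplus(s) for s in linear(self.w_sigma, self.b_sigma, x)]
        # The weights stay fixed; the layer output itself is sampled.
        return [self.rng.gauss(m, s) for m, s in zip(mu, sigma)]


def predict(layers, x, n_samples):
    """Runs several stochastic passes; returns per-output mean and variance."""
    outputs = []
    for _ in range(n_samples):
        h = x
        for layer in layers:
            h = layer.forward(h)
        outputs.append(h)
    n_out = len(outputs[0])
    mean = [sum(o[i] for o in outputs) / n_samples for i in range(n_out)]
    var = [sum((o[i] - mean[i]) ** 2 for o in outputs) / n_samples for i in range(n_out)]
    return mean, var


rng = random.Random(0)
layers = [VariationalLayer(3, 4, rng), VariationalLayer(4, 2, rng)]
mean, var = predict(layers, [0.2, -0.1, 0.7], n_samples=50)
```

Here `var` is the spread of the outputs across repeated passes for the same input, which is one simple way to read off an uncertainty estimate from such a model.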

Paper Structure (10 sections, 13 equations, 3 figures)

Introduction
Bayesian Neural Networks
Related Works
Variational Neural Networks
Variational Layer
Output uncertainty estimation
Epistemic uncertainty
Aleatoric uncertainty
Experiments
Conclusion

Figures (3)

Figure 1: Comparison of computational graphs of (a) the proposed VNNs, and (b) BNNs. BNNs consider a distribution $P(w)$ over weights and sample different weights during each inference. VNNs consider a constant set of weights and use them to generate parameters of a Gaussian distribution for each layer, outputs of which are sampled from the corresponding distribution. Layer weights are represented by $\mu, \sigma, w$, activation functions by $\alpha, \alpha_{\mu}, \alpha_{\sigma}, \alpha_{N}$, classical layers by $L(\cdot)$, and $N(\cdot)$ represents the Gaussian distribution.
Figure 2: Comparison of mean KL value with 1 STD range for each method averaged across different experiment parameters.
Figure 3: Comparison of classification accuracy on MNIST and CIFAR-10 datasets with different model architectures.
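The difference between the two computational graphs in Figure 1 can be sketched for a single output unit. The functions `bnn_layer` and `vnn_layer` are hypothetical names; modelling $P(w)$ as an independent Gaussian per weight, omitting the activation functions, and taking an absolute value to keep the standard deviation non-negative are simplifying assumptions, not details taken from the paper.

```python
import random


def bnn_layer(x, w_mean, w_std, rng):
    """BNN-style unit: sample weights from P(w), then apply a deterministic layer."""
    w = [rng.gauss(m, s) for m, s in zip(w_mean, w_std)]
    return sum(wi * xi for wi, xi in zip(w, x))


def vnn_layer(x, w_mu, w_sigma, rng):
    """VNN-style unit: constant weights produce Gaussian parameters; the output is sampled."""
    mu = sum(wi * xi for wi, xi in zip(w_mu, x))
    # Assumption: abs() keeps the standard deviation non-negative.
    sigma = abs(sum(wi * xi for wi, xi in zip(w_sigma, x)))
    return rng.gauss(mu, sigma)


rng = random.Random(1)
x = [1.0, 2.0]
bnn_samples = [bnn_layer(x, [0.5, -0.3], [0.1, 0.1], rng) for _ in range(5)]
vnn_samples = [vnn_layer(x, [0.5, -0.3], [0.2, 0.1], rng) for _ in range(5)]
```

In both cases repeated calls on the same input give different outputs; the randomness enters through the weights in `bnn_layer` and through the sampled output in `vnn_layer`.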
