Improved Depth Estimation of Bayesian Neural Networks

Bart van Erp, Bert de Vries

TL;DR

This paper proposes a discrete truncated normal distribution over the network depth, so that the mean and variance of the depth can be learned independently. The method improves test accuracy on the spiral data set and reduces the variance in posterior depth estimates.

Abstract

This paper proposes improvements over earlier work by Nazaret and Blei (2022) for estimating the depth of Bayesian neural networks. Here, we propose a discrete truncated normal distribution over the network depth to independently learn its mean and variance. Posterior distributions are inferred by minimizing the variational free energy, which balances model complexity and accuracy. Our method improves test accuracy on the spiral data set and reduces the variance in posterior depth estimates.
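To make the contrast between the two depth distributions concrete, the following is a minimal sketch, not the authors' implementation. It assumes the discrete truncated normal is obtained by evaluating an unnormalized Gaussian at the integer depths 1..`max_depth` and renormalizing, and it uses a Poisson-shaped distribution truncated to the same support as a stand-in for the Poisson-based model. The function names, the support, and the parameter values are all illustrative assumptions; the paper's exact parametrization may differ.

```python
import math


def discrete_truncated_normal_pmf(mu, sigma, max_depth):
    """PMF over depths 1..max_depth with weights exp(-(l - mu)^2 / (2 sigma^2)).

    Assumed construction: Gaussian evaluated at integer depths, then renormalized.
    """
    depths = list(range(1, max_depth + 1))
    log_w = [-0.5 * ((l - mu) / sigma) ** 2 for l in depths]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(x - shift) for x in log_w]
    z = sum(w)
    return depths, [x / z for x in w]


def truncated_poisson_pmf(rate, max_depth):
    """Poisson-shaped PMF restricted to depths 1..max_depth (illustrative stand-in)."""
    depths = list(range(1, max_depth + 1))
    log_w = [l * math.log(rate) - math.lgamma(l + 1) for l in depths]
    shift = max(log_w)
    w = [math.exp(x - shift) for x in log_w]
    z = sum(w)
    return depths, [x / z for x in w]


def mean_and_variance(depths, pmf):
    """Mean and variance of a discrete distribution given its support and PMF."""
    mean = sum(l * p for l, p in zip(depths, pmf))
    var = sum((l - mean) ** 2 * p for l, p in zip(depths, pmf))
    return mean, var


if __name__ == "__main__":
    # Normal: the spread is set by sigma, independently of the location mu.
    for mu, sigma in [(5.0, 1.0), (20.0, 1.0), (20.0, 3.0)]:
        m, v = mean_and_variance(*discrete_truncated_normal_pmf(mu, sigma, 100))
        print(f"normal  mu={mu:5.1f} sigma={sigma:.1f} -> mean={m:.2f} var={v:.2f}")
    # Poisson: a single rate controls both, so variance grows with the mean.
    for rate in [5.0, 20.0]:
        m, v = mean_and_variance(*truncated_poisson_pmf(rate, 100))
        print(f"poisson rate={rate:5.1f}           -> mean={m:.2f} var={v:.2f}")
```

Under these assumptions, the Poisson-shaped distribution ties its spread to its location through one rate parameter, whereas the normal-shaped distribution can place its mass at a large depth while keeping `sigma` small. This is the sense in which the mean and variance can be learned independently.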

Paper Structure

This paper contains 9 sections, 11 equations, 4 figures.

Figures (4)

  • Figure 1: Visualization of the non-linearity $\Omega_L$ in \ref{eq:omega}. Deeper models reuse parts of shallower models.
  • Figure 2: Spiral datasets for different rotation speeds $\omega$, generated according to Appendix \ref{appendix:details:data}.
  • Figure 3: (Left) Test accuracy on the spiral classification task for varying rotation speeds $\omega$. Solid lines represent the average accuracy over five independent runs, with shaded areas indicating one standard deviation ($\pm\sigma$). The discrete truncated normal distribution shows accuracy improvements across all rotation speeds compared to the Poisson-based model of Nazaret and Blei (2022). (Right) Means and standard deviations of the posterior distributions over network depth, shown for the first run, with similar trends across other runs. As expected, the variance of the Poisson-based model increases at larger depths, while the normal distribution converges to a single depth.
  • Figure 4: Log probability mass functions of (left) the prior distributions over the model depth, namely the prior used in Nazaret and Blei (2022) and the discrete truncated normal distribution used in this paper, and (right) the initial variational posterior distributions over the model depth.