Restricted Bayesian Neural Network

Sourav Ganguly, Saprativa Bhattacharjee

TL;DR

This work tackles uncertainty and memory efficiency in deep learning by introducing a Restricted Bayesian Neural Network (RBNN) that represents incoming weights for each neuron as Gaussian-distributed parameters, allowing storage of only distribution parameters rather than full weight matrices. Training is performed via Cross Entropy Optimization (CEO), a zero-order, gradient-free method that samples weight realizations from $\mathcal{N}(v,\Sigma)$, selects elite samples, and updates the Gaussian parameters to drive down the loss while keeping weights bounded in $[-1,1]$. The authors demonstrate competitive accuracy on Pulsar-star and Iris datasets, with notable storage advantages over standard FFNNs and traditional BNNs, and show rapid convergence within a small number of epochs. Overall, the approach offers a practical path to uncertainty-aware models that scale memory-wise to larger architectures, with clear avenues for regression and time-series extensions in future work.
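The sample, select, update loop described above can be illustrated with a minimal sketch of the generic cross-entropy method. It is not the authors' implementation. It assumes a diagonal covariance (one standard deviation per weight) in place of a full $\Sigma$, and the function name `cross_entropy_optimize`, the sample count, the elite fraction, and the toy quadratic loss are all hypothetical choices made for illustration.

```python
import numpy as np

def cross_entropy_optimize(loss_fn, dim, n_samples=100, elite_frac=0.2,
                           n_epochs=30, seed=0):
    """Gradient-free minimization of loss_fn over weights in [-1, 1]^dim.

    Assumption: the sampling distribution is N(mean, diag(std**2)),
    a diagonal simplification of the N(v, Sigma) named in the summary.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)   # initial Gaussian mean (hypothetical choice)
    std = np.ones(dim)     # initial per-weight standard deviation
    n_elite = max(1, int(n_samples * elite_frac))

    for _ in range(n_epochs):
        # 1. Sample candidate weight vectors and keep them bounded in [-1, 1].
        samples = rng.normal(mean, std, size=(n_samples, dim))
        samples = np.clip(samples, -1.0, 1.0)
        # 2. Score every candidate; only loss values are used, no gradients.
        losses = np.array([loss_fn(w) for w in samples])
        # 3. Keep the elite samples, i.e. those with the lowest loss.
        elite = samples[np.argsort(losses)[:n_elite]]
        # 4. Refit the Gaussian parameters to the elite set.
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor avoids a zero-width Gaussian

    return mean, std

# Toy usage: recover a hypothetical target weight vector under a quadratic loss.
target = np.array([0.5, -0.3, 0.1])
mean, std = cross_entropy_optimize(lambda w: np.sum((w - target) ** 2), dim=3)
```

Only `mean` and `std` need to be kept after training, which is the storage argument the summary makes: the distribution parameters stand in for the sampled weights.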

Abstract

Modern deep learning tools are remarkably effective in addressing intricate problems. However, their operation as black-box models introduces increased uncertainty in predictions. Additionally, they contend with various challenges, including the need for substantial storage space in large networks, issues of overfitting, underfitting, vanishing gradients, and more. This study explores the concept of Bayesian Neural Networks, presenting a novel architecture designed to significantly alleviate the storage space complexity of a network. Furthermore, we introduce an algorithm adept at efficiently handling uncertainties, ensuring robust convergence values without becoming trapped in local optima, particularly when the objective function lacks perfect convexity.

Paper Structure (16 sections, 6 equations, 4 figures, 3 tables, 5 algorithms)

Figures (4)

  • Figure 1: A simple neural network
  • Figure 2: Depiction of outgoing edges from input layer
  • Figure 3: Error plots on Pulsar star dataset
  • Figure 4: Error plots on IRIS dataset