Restricted Bayesian Neural Network
Sourav Ganguly, Saprativa Bhattacharjee
TL;DR
This work addresses uncertainty and memory efficiency in deep learning by introducing a Restricted Bayesian Neural Network (RBNN) that models the incoming weights of each neuron as Gaussian-distributed, so that only the distribution parameters need to be stored rather than full weight matrices. Training uses Cross Entropy Optimization (CEO), a zeroth-order, gradient-free method that samples weight realizations from $\mathcal{N}(v,\Sigma)$, selects the elite (lowest-loss) samples, and updates the Gaussian parameters to drive down the loss while keeping weights bounded in $[-1,1]$. The authors report competitive accuracy on the Pulsar-star and Iris datasets, with storage advantages over standard feed-forward neural networks (FFNNs) and traditional BNNs, and convergence within a small number of epochs. The approach offers a practical path to uncertainty-aware models whose memory footprint scales to larger architectures; extensions to regression and time series are left to future work.
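The sample, select, and update loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a diagonal covariance (one standard deviation per weight), a toy quadratic loss in place of a network's training loss, and hypothetical names and hyperparameters (`cross_entropy_minimize`, `elite_frac`, `min_std`, and so on) chosen only for illustration.

```python
import random
import statistics


def cross_entropy_minimize(loss, dim, n_samples=50, elite_frac=0.2,
                           n_iters=30, init_std=0.5, min_std=1e-3, seed=0):
    """Gradient-free minimization of `loss` over weights in [-1, 1]^dim.

    Assumption: Sigma is diagonal, so the search distribution is stored
    as one mean and one standard deviation per weight.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim        # v: per-weight Gaussian mean
    std = [init_std] * dim    # square roots of the diagonal of Sigma
    n_elite = max(2, int(n_samples * elite_frac))

    for _ in range(n_iters):
        # Sample candidate weight vectors and clip each weight to [-1, 1].
        samples = [
            [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(mean, std)]
            for _ in range(n_samples)
        ]
        # Keep the elite samples, i.e. those with the lowest loss.
        elites = sorted(samples, key=loss)[:n_elite]
        # Refit the Gaussian to the elites; the floor on std is an
        # illustrative safeguard against the distribution collapsing.
        columns = list(zip(*elites))
        mean = [statistics.fmean(col) for col in columns]
        std = [max(min_std, statistics.pstdev(col)) for col in columns]

    return mean, std


def toy_loss(w):
    """Hypothetical stand-in for a training loss: distance to a fixed target."""
    target = (0.3, -0.7)
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


mean, std = cross_entropy_minimize(toy_loss, dim=2)
print("mean:", mean)
print("std:", std)
```

In this sketch only `mean` and `std` persist between iterations, which mirrors the idea of storing distribution parameters instead of sampled weights. A full covariance matrix $\Sigma$ would require fitting the elite samples jointly rather than per coordinate.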
Abstract
Modern deep learning tools are remarkably effective at solving intricate problems. However, their operation as black-box models increases the uncertainty in their predictions. They also face other challenges, including the substantial storage required by large networks, overfitting, underfitting, and vanishing gradients. This study explores Bayesian Neural Networks and presents a novel architecture designed to significantly reduce the storage space complexity of a network. Furthermore, we introduce an algorithm that handles uncertainty efficiently and converges robustly without becoming trapped in local optima, particularly when the objective function is not perfectly convex.
