Partially Stochastic Infinitely Deep Bayesian Neural Networks
Sergio Calvo-Ordonez, Matthieu Meunier, Francesco Piatti, Yuantao Shi
TL;DR
The paper tackles scalable uncertainty quantification for infinitely deep Bayesian neural networks by introducing Partially Stochastic Infinitely Deep BNNs (PSDE-BNNs). It combines Neural ODEs with Neural SDEs and introduces vertical and horizontal weight-separation schemes to inject partial stochasticity, paired with variational training and OU priors to control complexity. The authors prove expressivity guarantees, showing PSDE-BNNs are Universal Conditional Distribution Approximators under suitable conditions, and demonstrate empirical gains in accuracy, calibration, and uncertainty quantification with substantial efficiency improvements over fully stochastic models. This approach enables practical, uncertainty-aware inference in infinite-depth models and offers a flexible framework for future explorations in variance reduction and broader datasets.
Abstract
In this paper, we present Partially Stochastic Infinitely Deep Bayesian Neural Networks, a novel family of architectures that integrates partial stochasticity into the framework of infinitely deep neural networks. Our new class of architectures is designed to improve the computational efficiency of existing architectures at training and inference time. To do this, we leverage the advantages of partial stochasticity in the infinite-depth limit which include the benefits of full stochasticity e.g. robustness, uncertainty quantification, and memory efficiency, whilst improving their limitations around computational complexity. We present a variety of architectural configurations, offering flexibility in network design including different methods for weight partition. We also provide mathematical guarantees on the expressivity of our models by establishing that our network family qualifies as Universal Conditional Distribution Approximators. Lastly, empirical evaluations across multiple tasks show that our proposed architectures achieve better downstream task performance and uncertainty quantification than their counterparts while being significantly more efficient. The code can be found at \url{https://github.com/Sergio20f/part_stoch_inf_deep}
