On Feynman–Kac training of partial Bayesian neural networks
Zheng Zhao, Sebastian Mair, Thomas B. Schön, Jens Sjölund
TL;DR
This work addresses the training of partial Bayesian neural networks (pBNNs), in which only a subset of the weights is stochastic, so inference over latent variables is required. It casts training as simulating a Feynman–Kac model and develops scalable sequential Monte Carlo (SMC) samplers that jointly estimate the deterministic parameters $\psi$ and the latent posterior $p(\phi \mid y_{1:N}; \psi)$, which handles multi-modality in the latent space. Two algorithms are proposed: SGSMC uses stochastic gradients computed on mini-batches, while OHSMC warm-starts from earlier posterior estimates and updates parameters and posteriors concurrently; extensions such as Poisson estimators reduce bias. Across synthetic, UCI, and MNIST experiments, the methods achieve state-of-the-art or competitive predictive performance relative to MAP-HMC, SWAG, and VB, which indicates that they scale to practical uncertainty quantification in pBNNs.
Abstract
Recently, partial Bayesian neural networks (pBNNs), which only consider a subset of the parameters to be stochastic, were shown to perform competitively with full Bayesian neural networks. However, pBNNs are often multi-modal in the latent variable space and thus challenging to approximate with parametric models. To address this problem, we propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman–Kac model. We then describe variations of sequential Monte Carlo samplers that allow us to simultaneously estimate the parameters and the latent posterior distribution of this model at a tractable computational cost. Using various synthetic and real-world datasets, we show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
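To make the idea of estimating the parameters and the latent posterior simultaneously more concrete, the following is a minimal sketch on a toy problem. It is not the paper's SGSMC or OHSMC algorithm. It assumes a hypothetical one-dimensional model $y_n \sim \mathcal{N}(\phi + \psi x_n, 1)$ with a stochastic intercept $\phi \sim \mathcal{N}(0, 1)$ and a deterministic slope $\psi$. Each observation reweights the particles, a weighted-gradient step updates $\psi$, and a resample-move step restores particle diversity. The names `smc_train`, `log_lik`, and `log_target`, the learning rate, and the proposal step size are illustrative choices only. Because $\psi$ changes while weights accumulate, the particle approximation is biased in this simplified form.

```python
import math
import random


def log_lik(y, x, phi, psi):
    """Log-density of y under N(phi + psi * x, 1)."""
    return -0.5 * (y - phi - psi * x) ** 2 - 0.5 * math.log(2.0 * math.pi)


def log_target(phi, psi, seen):
    """Unnormalised log-posterior of phi (N(0, 1) prior) given the data seen so far."""
    return -0.5 * phi ** 2 + sum(log_lik(y, x, phi, psi) for x, y in seen)


def smc_train(data, num_particles=50, lr=0.05, seed=0):
    rng = random.Random(seed)
    psi = 0.0  # deterministic parameter, learned by gradient ascent
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]  # prior draws
    log_w = [0.0] * num_particles
    w = [1.0 / num_particles] * num_particles
    seen = []
    for x, y in data:
        seen.append((x, y))
        # Reweight each particle by the likelihood of the new observation.
        log_w = [lw + log_lik(y, x, p, psi) for lw, p in zip(log_w, particles)]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        # Weighted average of d/dpsi log_lik = (y - phi - psi * x) * x.
        grad = sum(wi * (y - p - psi * x) * x for wi, p in zip(w, particles))
        psi += lr * grad
        # Resample and move when the effective sample size drops below half.
        if 1.0 / sum(wi * wi for wi in w) < num_particles / 2:
            particles = rng.choices(particles, weights=w, k=num_particles)
            step = 1.0 / math.sqrt(len(seen) + 1)  # assumed proposal scale
            moved = []
            for p in particles:
                # One random-walk Metropolis step per particle.
                prop = p + step * rng.gauss(0.0, 1.0)
                log_ratio = log_target(prop, psi, seen) - log_target(p, psi, seen)
                accept = math.log(1.0 - rng.random()) < log_ratio
                moved.append(prop if accept else p)
            particles = moved
            log_w = [0.0] * num_particles
            w = [1.0 / num_particles] * num_particles
    return psi, particles, w


# Synthetic data with true intercept 0.5 and true slope 2.0.
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(300)]
data = [(x, 0.5 + 2.0 * x + data_rng.gauss(0.0, 1.0)) for x in xs]
psi_hat, particles, weights = smc_train(data)
```

Here `psi_hat` is the point estimate of the deterministic parameter, and the pair `(particles, weights)` is a weighted-sample approximation of the latent posterior. In a pBNN, $\phi$ would be the stochastic subset of the network weights, and the single-observation update would typically be replaced by a mini-batch update.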
