Variational Bayesian Last Layers
James Harrison, John Willes, Jasper Snoek
TL;DR
This work addresses reliable, low-overhead uncertainty estimation in deep networks by proposing Variational Bayesian Last Layers (VBLLs), a sampling-free Bayesian last-layer approach that yields a tractable, deterministic lower bound on the marginal likelihood. The authors derive ELBOs for regression, discriminative classification, and generative classification, and describe training variants that either optimize the last layer jointly with the features or fit it post hoc on frozen features. Training and evaluation cost is quadratic in the last-layer width, so VBLLs remain compatible with standard architectures. Empirical results across regression, image classification, sentiment analysis with LLM features, and contextual bandits show improved predictive accuracy, calibration, and out-of-distribution detection relative to strong baselines. The work also gives practical guidance on hyperparameters and prediction strategies, and explores combining VBLLs with variational feature learning to obtain a collapsed variational inference method.
Abstract
We introduce a deterministic variational formulation for training Bayesian last-layer neural networks. This yields a sampling-free, single-pass model and loss that effectively improve uncertainty estimation. Our variational Bayesian last layer (VBLL) can be trained and evaluated with only quadratic complexity in last-layer width, and is thus (nearly) computationally free to add to standard architectures. We experimentally investigate VBLLs, and show that they improve predictive accuracy, calibration, and out-of-distribution detection over baselines across both regression and classification. Finally, we investigate combining VBLL layers with variational Bayesian feature learning, yielding a lower-variance collapsed variational inference method for Bayesian neural networks.
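To make "sampling-free" and "quadratic complexity in last-layer width" concrete, here is a minimal NumPy sketch of the scalar-output regression case. It assumes a Gaussian variational posterior q(w) = N(w_mean, w_cov) over the last-layer weights and Gaussian observation noise with variance noise_var. The function names, argument names, and the restriction to a single output are illustrative assumptions, not the authors' implementation. Under these assumptions both the predictive distribution and the expected log-likelihood term of the ELBO have closed forms, and the dominant cost is the quadratic form phi^T S phi.

```python
import numpy as np

def bll_predictive(phi, w_mean, w_cov, noise_var):
    """Closed-form Gaussian predictive for a scalar-output Bayesian last layer.

    phi:       (n,) feature vector from the network body (assumed given)
    w_mean:    (n,) mean of the Gaussian posterior over last-layer weights
    w_cov:     (n, n) covariance of that posterior
    noise_var: scalar observation-noise variance
    """
    mean = phi @ w_mean
    # Quadratic form: O(n^2) in the last-layer width n.
    var = phi @ w_cov @ phi + noise_var
    return mean, var

def expected_log_likelihood(phi, y, w_mean, w_cov, noise_var):
    """E_q[log N(y | w^T phi, noise_var)] with q(w) = N(w_mean, w_cov).

    Evaluated exactly, with no Monte Carlo samples of w: the Gaussian
    log-density at the posterior mean minus a variance penalty.
    """
    resid = y - phi @ w_mean
    log_gauss = -0.5 * (np.log(2.0 * np.pi * noise_var) + resid**2 / noise_var)
    penalty = 0.5 * (phi @ w_cov @ phi) / noise_var
    return log_gauss - penalty

# Illustrative values only (hypothetical, not taken from the paper).
phi = np.array([1.0, 2.0])
w_mean = np.array([0.5, -0.25])
w_cov = np.eye(2)
mean, var = bll_predictive(phi, w_mean, w_cov, noise_var=1.0)
ell = expected_log_likelihood(phi, 0.0, w_mean, w_cov, noise_var=1.0)
```

A full training objective would also subtract a KL term between q(w) and the weight prior, scaled appropriately for the dataset size; that term is omitted here to keep the sketch short.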
