Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling
Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley
TL;DR
This work targets the limited expressiveness of Gaussian priors in Bayesian Last Layer models by introducing implicit weight priors parameterized by a neural generator and performing posterior sampling over auxiliary variables with diffusion models. It derives a tractable variational lower bound for the marginal likelihood using a time-reversal diffusion SDE, score matching with a reference process, and KL-based optimization, enabling end-to-end training of the DVI-IBLL method. Empirical results on regression (UCI) and image classification (CIFAR-10/100) demonstrate improved predictive uncertainty (NLL, calibration) and competitive accuracy and OOD detection, with only moderate runtime overhead. The proposed approach advances uncertainty quantification in neural networks by marrying flexible priors, diffusion-based posterior inference, and scalable optimization for challenging data distributions. Overall, it broadens the applicability of Bayesian Last Layer models to non-Gaussian, outlier-prone, and high-dimensional settings while maintaining efficiency.
Abstract
Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in BLL models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall, we introduce a novel approach that combines diffusion techniques and implicit priors for variational learning of Bayesian last layer weights. This method leverages implicit distributions for modeling weight priors in BLL, coupled with diffusion samplers for approximating true posterior predictions, thereby establishing a comprehensive Bayesian prior and posterior estimation strategy. By delivering an explicit and computationally efficient variational lower bound, our method aims to augment the expressive abilities of BLL models, enhancing model accuracy, calibration, and out-of-distribution detection proficiency. Through detailed exploration and experimental validation, we showcase the method's potential for improving predictive accuracy and uncertainty quantification while ensuring computational efficiency.
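To make the idea of an implicit weight prior concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-layer generator G with random, untrained parameters: auxiliary variables z are drawn from a standard Gaussian and mapped to last-layer weights w = G(z), so w can be sampled but has no closed-form density. The names `init_generator`, `sample_last_layer_weights`, and `prior_predictive`, the layer sizes, and the random stand-in features are all assumptions made for illustration. The diffusion-based posterior sampler over z is not shown.

```python
import numpy as np

def init_generator(rng, latent_dim, hidden_dim, feature_dim):
    """Random parameters for a hypothetical two-layer generator G(z)."""
    return {
        "W1": rng.normal(scale=1.0 / np.sqrt(latent_dim), size=(latent_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(scale=1.0 / np.sqrt(hidden_dim), size=(hidden_dim, feature_dim)),
        "b2": np.zeros(feature_dim),
    }

def sample_last_layer_weights(rng, params, n_samples):
    """Draw w = G(z) with z ~ N(0, I); the density of w is implicit."""
    latent_dim = params["W1"].shape[0]
    z = rng.standard_normal((n_samples, latent_dim))
    h = np.tanh(z @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]  # shape: (n_samples, feature_dim)

def prior_predictive(features, weights):
    """Regression outputs phi(x)^T w for every weight sample."""
    return features @ weights.T  # shape: (n_points, n_samples)

rng = np.random.default_rng(0)
params = init_generator(rng, latent_dim=4, hidden_dim=16, feature_dim=8)
weights = sample_last_layer_weights(rng, params, n_samples=100)

# Stand-in for phi(x); in a BLL model these come from the network's feature extractor.
features = rng.standard_normal((5, 8))
preds = prior_predictive(features, weights)
print(preds.shape)  # (5, 100)
```

Because the generator is nonlinear, the induced distribution over w is generally non-Gaussian, which is the flexibility the abstract contrasts with a Gaussian prior on the last-layer weights.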
