Variational Bayesian Pseudo-Coreset
Hyungi Lee, Seungyoo Lee, Juho Lee
TL;DR
Variational Bayesian Pseudo-Coreset (VBPC) reduces the computational cost of posterior estimation in Bayesian neural networks by learning a small pseudo-coreset and performing variational inference over the last layer only. Using a Gaussian likelihood for the coreset inner problem, VBPC derives closed-form updates for the coreset parameters and avoids stop-gradient or weight-space sampling, which enables memory-efficient Bayesian model averaging. The framework uses bilevel optimization with a model pool to train pseudo-coresets that generalize across feature maps. It achieves strong accuracy (ACC) and notably improved negative log-likelihood (NLL) on several benchmark datasets, and remains robust under distribution shift. Overall, VBPC reduces both memory and computation while preserving or improving uncertainty estimates, offering a practical approach to scalable Bayesian inference in deep learning.
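To make the closed-form claim concrete, the sketch below shows the standard conjugate result that a Gaussian likelihood makes available: with fixed features, a Gaussian prior on the last-layer weights, and Gaussian observation noise, the weight posterior is Gaussian and can be computed without sampling. This is textbook Bayesian linear regression on features, not the authors' exact update rule. The function names, the isotropic prior, and the parameters `prior_precision` and `noise_precision` are assumptions made for illustration.

```python
import numpy as np


def last_layer_posterior(features, targets, prior_precision=1.0, noise_precision=1.0):
    """Closed-form Gaussian posterior over last-layer weights (illustrative).

    features: (n, d) feature matrix, e.g. features of the pseudo-coreset inputs.
    targets:  (n, k) regression targets, e.g. one-hot labels.
    Returns the posterior mean (d, k) and the covariance (d, d), which is
    shared by all k output dimensions under these assumptions.
    """
    d = features.shape[1]
    # Posterior precision = prior precision + noise-weighted Gram matrix.
    precision = prior_precision * np.eye(d) + noise_precision * features.T @ features
    covariance = np.linalg.inv(precision)
    mean = noise_precision * covariance @ features.T @ targets
    return mean, covariance


def predictive(features_new, mean, covariance, noise_precision=1.0):
    """Predictive mean (m, k) and per-input variance (m,) for new features."""
    pred_mean = features_new @ mean
    # Observation noise plus the weight uncertainty projected onto each input.
    epistemic = np.einsum("nd,de,ne->n", features_new, covariance, features_new)
    pred_var = 1.0 / noise_precision + epistemic
    return pred_mean, pred_var


# Tiny usage example with two coreset points and two classes (hypothetical data).
phi = np.eye(2)
y = np.eye(2)
w_mean, w_cov = last_layer_posterior(phi, y)
p_mean, p_var = predictive(np.array([[1.0, 0.0]]), w_mean, w_cov)
```

Because the posterior is available analytically, Bayesian model averaging in this simplified setting needs only the mean and covariance of the last layer rather than a set of sampled weight vectors, which is the source of the memory savings the summary describes.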
Abstract
The success of deep learning requires large datasets and extensive training, which can create significant computational challenges. To address these challenges, pseudo-coresets, small learnable datasets that mimic the full dataset, have been proposed. Bayesian Neural Networks, which offer predictive uncertainty and a probabilistic interpretation for deep neural networks, also struggle with large-scale datasets because of their high-dimensional parameter space. Prior works on Bayesian Pseudo-Coresets (BPC) attempt to reduce the cost of computing the weight posterior distribution by using a small pseudo-coreset, but they suffer from memory inefficiency during BPC training and yield sub-optimal results. To overcome these limitations, we propose Variational Bayesian Pseudo-Coreset (VBPC), a novel approach that uses variational inference to efficiently approximate the posterior distribution, reducing memory usage and computational costs while improving performance across benchmark datasets.
