Variance Control for Black Box Variational Inference Using The James-Stein Estimator
Dominic B. Dayta
TL;DR
This paper addresses instability and tuning challenges in Black Box Variational Inference (BBVI) arising from high-variance ELBO gradient estimates. It reframes BBVI updates as a multivariate estimation problem and applies positive-part James-Stein shrinkage to the gradient estimator (BBVI-JS+), achieving variance control without requiring explicit factorization of the variational family. Theoretical results show that JS+ can dominate the naive gradient estimate in mean-squared error, while experiments on Gaussian mixtures and benchmark datasets demonstrate stable convergence and competitive model fit relative to Rao-Blackwellized BBVI, with robust performance across varying model sizes. The approach offers a simple, black-box-compatible form of variance reduction that can be integrated with RMSProp and applied to large-scale Bayesian models, potentially broadening BBVI's applicability and reliability.
Abstract
Black Box Variational Inference is a promising framework in a succession of recent efforts to make Variational Inference more "black box". However, in its basic version it either fails to converge due to instability or requires some fine-tuning of the update steps prior to execution, which hinders it from being completely general purpose. We propose a method for regulating its parameter updates by reframing stochastic gradient ascent as a multivariate estimation problem. We examine the properties of the James-Stein estimator as a replacement for the arithmetic mean of Monte Carlo estimates of the gradient of the evidence lower bound. The proposed method provides relatively weaker variance reduction than Rao-Blackwellization, but offers the tradeoff of being simpler and requiring no fine-tuning on the part of the analyst. Performance on benchmark datasets also demonstrates consistent results on par with or better than the Rao-Blackwellized approach in terms of model fit and time to convergence.
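As a rough illustration of replacing the arithmetic mean of Monte Carlo gradient samples with a James-Stein estimate, the sketch below applies the textbook positive-part shrinkage toward zero. The function name `james_stein_plus`, the pooled isotropic variance estimate, and the choice of zero as the shrinkage target are assumptions made for illustration; the exact estimator used in the paper may differ.

```python
import numpy as np

def james_stein_plus(grad_samples):
    """Positive-part James-Stein shrinkage of the mean of gradient samples.

    grad_samples: array of shape (n_samples, dim), one Monte Carlo
    gradient estimate per row. Hypothetical helper, not the paper's code.
    """
    g = np.asarray(grad_samples, dtype=float)
    n_samples, dim = g.shape
    mean = g.mean(axis=0)

    # James-Stein dominance over the plain mean requires dim >= 3, and a
    # variance estimate needs at least two samples; otherwise fall back.
    if dim < 3 or n_samples < 2:
        return mean

    # Assumption: a common (isotropic) variance across coordinates,
    # estimated by pooling per-coordinate sample variances. Dividing by
    # n_samples gives the variance of the sample mean.
    sigma2 = g.var(axis=0, ddof=1).mean() / n_samples

    norm2 = float(mean @ mean)
    if norm2 == 0.0:
        return mean

    # Positive-part rule: clip the shrinkage factor at zero so the
    # estimate never flips the sign of the sample mean.
    shrink = max(0.0, 1.0 - (dim - 2) * sigma2 / norm2)
    return shrink * mean
```

In a stochastic gradient ascent loop, the returned vector would take the place of the sample mean in the parameter update. When the samples are noisy relative to the size of their mean, the factor approaches zero and the step is damped; when they agree closely, the factor approaches one and the update is nearly the ordinary averaged gradient.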
