Batch, match, and patch: low-rank approximations for score-based variational inference
Chirag Modi, Diana Cai, Lawrence K. Saul
TL;DR
This work tackles the scalability of score-based variational inference in high dimensions by extending batch-and-match (BaM) with a patch step, so that the Gaussian approximation uses a low-rank plus diagonal (LR+D) covariance, $\Sigma = \Lambda \Lambda^\top + \Psi$. The patch step, based on an EM algorithm for factor analysis, projects unconstrained covariance updates onto this LR+D family, yielding a method (pBaM) whose cost scales linearly with the dimensionality $D$. Empirically, pBaM outperforms ADVI with LR+D and LR-only variants on synthetic Gaussian targets and on real-world high-dimensional problems such as Poisson regression, log-Gaussian Cox processes (LGCP), and item response theory (IRT) models, while remaining stable and converging faster than full BaM in many settings. The approach makes score-based variational inference practical for very high-dimensional Bayesian models whose posteriors call for structured covariances.
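To illustrate why an LR+D covariance allows costs linear in $D$, here is a minimal NumPy sketch that evaluates a Gaussian log-density and draws samples without ever forming the $D \times D$ matrix. It uses the Woodbury identity and the matrix determinant lemma. The function names `lrd_logpdf` and `lrd_sample` are hypothetical and not taken from the paper. The sketch assumes $\Psi$ is diagonal with positive entries, stored as a vector `psi`.

```python
import numpy as np

def lrd_logpdf(x, mu, Lam, psi):
    """Log-density of N(mu, Lam Lam^T + diag(psi)) in O(D K^2) time."""
    D, K = Lam.shape
    r = x - mu
    # K x K matrix M = I + Lam^T Psi^{-1} Lam
    M = np.eye(K) + Lam.T @ (Lam / psi[:, None])
    chol = np.linalg.cholesky(M)
    # Woodbury: r^T Sigma^{-1} r = r^T Psi^{-1} r - v^T M^{-1} v
    v = Lam.T @ (r / psi)
    w = np.linalg.solve(chol, v)  # chol w = v, so w^T w = v^T M^{-1} v
    quad = r @ (r / psi) - w @ w
    # Determinant lemma: log|Sigma| = log|M| + sum(log psi)
    logdet = 2.0 * np.sum(np.log(np.diag(chol))) + np.sum(np.log(psi))
    return -0.5 * (D * np.log(2.0 * np.pi) + logdet + quad)

def lrd_sample(rng, mu, Lam, psi, n):
    """Draw n samples as mu + Lam z + sqrt(psi) * eps with standard normal z, eps."""
    D, K = Lam.shape
    z = rng.standard_normal((n, K))
    eps = rng.standard_normal((n, D))
    return mu + z @ Lam.T + eps * np.sqrt(psi)
```

Both routines touch only the $D \times K$ factor and the length-$D$ diagonal, so memory and time grow linearly in $D$ for a fixed rank $K$.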
Abstract
Black-box variational inference (BBVI) scales poorly to high-dimensional problems when it is used to estimate a multivariate Gaussian approximation with a full covariance matrix. In this paper, we extend the batch-and-match (BaM) framework for score-based BBVI to problems where it is prohibitively expensive to store such covariance matrices, let alone to estimate them. Unlike classical algorithms for BBVI, which use stochastic gradient descent to minimize the reverse Kullback-Leibler divergence, BaM uses more specialized updates to match the scores of the target density and its Gaussian approximation. We extend the updates for BaM by integrating them with a more compact parameterization of full covariance matrices. In particular, borrowing ideas from factor analysis, we add an extra step to each iteration of BaM (a patch) that projects each newly updated covariance matrix into a more efficiently parameterized family of diagonal plus low-rank matrices. We evaluate this approach on a variety of synthetic target distributions and real-world problems in high-dimensional inference.
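The projection idea can be sketched with the textbook EM update for factor analysis, applied to a given covariance matrix `S`. This is only an illustration, not the paper's implementation: it forms `S` densely, so it is not linear in $D$, and it treats the mean as fixed at zero. The names `em_patch_step` and `gaussian_kl` are hypothetical.

```python
import numpy as np

def gaussian_kl(S, Lam, psi):
    """KL( N(0, S) || N(0, Lam Lam^T + diag(psi)) ), computed densely."""
    D = S.shape[0]
    Sig = Lam @ Lam.T + np.diag(psi)
    trace_term = np.trace(np.linalg.solve(Sig, S))
    return 0.5 * (trace_term - D + np.linalg.slogdet(Sig)[1] - np.linalg.slogdet(S)[1])

def em_patch_step(S, Lam, psi):
    """One EM update of (Lam, psi) toward the LR+D fit of the covariance S."""
    D, K = Lam.shape
    Sig = Lam @ Lam.T + np.diag(psi)
    # E-step: beta = Lam^T Sig^{-1} (K x D), using the symmetry of Sig
    beta = np.linalg.solve(Sig, Lam).T
    # Expected latent second moment: I - beta Lam + beta S beta^T
    Ezz = np.eye(K) - beta @ Lam + beta @ S @ beta.T
    # M-step: Lam_new = S beta^T Ezz^{-1}, using the symmetry of Ezz and S
    Lam_new = np.linalg.solve(Ezz, beta @ S).T
    psi_new = np.diag(S - Lam_new @ beta @ S).copy()
    return Lam_new, psi_new

# Usage: project a covariance that is exactly LR+D, starting from a random factor.
rng = np.random.default_rng(0)
D, K = 8, 2
L_true = rng.standard_normal((D, K))
S = L_true @ L_true.T + np.diag(rng.uniform(0.5, 1.5, D))
Lam, psi = 0.1 * rng.standard_normal((D, K)), np.diag(S).copy()
kls = [gaussian_kl(S, Lam, psi)]
for _ in range(200):
    Lam, psi = em_patch_step(S, Lam, psi)
    kls.append(gaussian_kl(S, Lam, psi))
```

Because EM never decreases the factor-analysis likelihood, the recorded values in `kls` should be non-increasing, which gives a simple sanity check on the update.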
