Batch, match, and patch: low-rank approximations for score-based variational inference

Chirag Modi, Diana Cai, Lawrence K. Saul

TL;DR

This work tackles the scalability of score-based variational inference in high dimensions by extending batch-and-match (BaM) with a patch step, so that the Gaussian approximation uses a low-rank plus diagonal (LR+D) covariance, $\Sigma = \Lambda \Lambda^\top + \Psi$. The patch step, based on an EM algorithm for factor analysis, projects each unconstrained covariance update onto this LR+D family, yielding a method (pBaM) whose cost scales linearly with the dimensionality $D$. Empirically, pBaM outperforms the LR+D and LR-only variants of ADVI on synthetic Gaussian targets and on real-world high-dimensional problems such as Poisson regression, LGCP, and IRT, while remaining stable and, in many settings, converging faster than full BaM. The approach makes score-based variational inference practical for very high-dimensional Bayesian models that call for structured covariances.
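To make the linear-in-$D$ claim concrete, the following is a minimal NumPy sketch of how one might sample from and evaluate a Gaussian with covariance $\Sigma = \Lambda \Lambda^\top + \Psi$ without ever forming the $D \times D$ matrix. It relies on the standard Woodbury identity and matrix determinant lemma; the function names `lrd_sample` and `lrd_logpdf` are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def lrd_sample(mu, Lam, psi, n, rng):
    """Draw n samples from N(mu, Lam Lam^T + diag(psi)) at O(n D K) cost."""
    D, K = Lam.shape
    z = rng.standard_normal((n, K))      # latent factors
    eps = rng.standard_normal((n, D))    # independent per-coordinate noise
    return mu + z @ Lam.T + eps * np.sqrt(psi)

def lrd_logpdf(x, mu, Lam, psi):
    """Log-density of N(mu, Lam Lam^T + diag(psi)) using only K x K solves."""
    D, K = Lam.shape
    r = x - mu                           # residuals, shape (n, D)
    Lam_s = Lam / psi[:, None]           # Psi^{-1} Lam, shape (D, K)
    M = np.eye(K) + Lam.T @ Lam_s        # K x K capacitance matrix
    t = r @ Lam_s                        # (n, K)
    # Woodbury: r^T Sigma^{-1} r = r^T Psi^{-1} r - t M^{-1} t^T
    quad = np.sum(r * r / psi, axis=1) - np.sum(t * np.linalg.solve(M, t.T).T, axis=1)
    # Matrix determinant lemma: log|Sigma| = log|M| + sum(log psi)
    logdet = np.linalg.slogdet(M)[1] + np.sum(np.log(psi))
    return -0.5 * (quad + logdet + D * np.log(2 * np.pi))
```

Both routines cost $O(DK^2)$ or less per evaluation, so for fixed rank $K$ the work grows linearly with $D$; here `psi` is assumed to hold the (strictly positive) diagonal of $\Psi$ as a vector.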

Abstract

Black-box variational inference (BBVI) scales poorly to high-dimensional problems when it is used to estimate a multivariate Gaussian approximation with a full covariance matrix. In this paper, we extend the batch-and-match (BaM) framework for score-based BBVI to problems where it is prohibitively expensive to store such covariance matrices, let alone to estimate them. Unlike classical algorithms for BBVI, which use stochastic gradient descent to minimize the reverse Kullback-Leibler divergence, BaM uses more specialized updates to match the scores of the target density and its Gaussian approximation. We extend the updates for BaM by integrating them with a more compact parameterization of full covariance matrices. In particular, borrowing ideas from factor analysis, we add an extra step to each iteration of BaM, a patch, that projects each newly updated covariance matrix into a more efficiently parameterized family of diagonal plus low-rank matrices. We evaluate this approach on a variety of synthetic target distributions and real-world problems in high-dimensional inference.
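As an illustration of the projection idea, here is a sketch of the textbook EM update for factor analysis, applied to a given covariance matrix `S`. This is an assumption-laden stand-in rather than the paper's patch step: it forms `S` and $\Sigma$ densely for clarity, so it does not achieve the linear-in-$D$ cost described above, and the names `em_patch_step` and `gauss_kl` are hypothetical.

```python
import numpy as np

def em_patch_step(S, Lam, psi):
    """One textbook EM update for factor analysis fitted to a covariance S.

    Dense illustration only (assumption): S and Sigma are formed explicitly,
    which a scalable implementation would avoid.
    """
    D, K = Lam.shape
    Sigma = Lam @ Lam.T + np.diag(psi)
    beta = np.linalg.solve(Sigma, Lam).T           # K x D, beta = Lam^T Sigma^{-1}
    S_beta = S @ beta.T                            # D x K
    Ezz = np.eye(K) - beta @ Lam + beta @ S_beta   # expected latent second moment
    Lam_new = np.linalg.solve(Ezz.T, S_beta.T).T   # Lam_new = S beta^T Ezz^{-1}
    psi_new = np.diag(S) - np.sum(Lam_new * S_beta, axis=1)  # diag(S - Lam_new beta S)
    return Lam_new, psi_new

def gauss_kl(S, Sigma):
    """KL( N(0, S) || N(0, Sigma) ) for zero-mean Gaussians."""
    D = S.shape[0]
    return 0.5 * (np.trace(np.linalg.solve(Sigma, S)) - D
                  + np.linalg.slogdet(Sigma)[1] - np.linalg.slogdet(S)[1])
```

Iterating `em_patch_step` from an initial `(Lam, psi)` should not increase `gauss_kl(S, Lam @ Lam.T + np.diag(psi))`, because maximizing the factor-analysis likelihood for a covariance `S` is equivalent to minimizing this KL divergence over the diagonal plus low-rank family.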

Paper Structure

This paper contains 34 sections, 46 equations, 13 figures, 1 table, 2 algorithms.

Figures (13)

  • Figure 3.1: Scaling of the pBaM algorithm with dimension, batch size, and rank. The top row shows the timing for a BaM step (excluding evaluation of the target's scores) and the bottom row shows the timing for a single EM step. All timings are in seconds. We also show linear, quadratic, and cubic fits to the data points.
  • Figure 5.1: Performance with increasing dimension for a Gaussian target with low-rank ($K=32$) plus diagonal covariance. We monitor the reverse KL divergence for pBaM (blue) and ADVI-LR (orange) with $K=32$, and for ADVI-D (green), at two values of $\lambda_t$ and of the learning rate, respectively (solid vs. dashed).
  • Figure 5.2: Performance with increasing rank ($K$) of the variational family. The target is a 512-dimensional Gaussian with low-rank ($K=256$) plus diagonal covariance. We monitor the reverse KL divergence for pBaM (solid) and ADVI-LR (dashed), and show marginal histograms for the first dimension.
  • Figure 5.3: Synthetic GP-Poisson regression, $D=100$. We show the reverse KL divergence (left) and the rate estimated using the posterior means (right).
  • Figure 5.4: Log Gaussian Cox Process, $D=811$. We plot the reverse KL divergence (left) and the estimated rate (right).
  • ...and 8 more figures