Robust and Scalable Variational Bayes

Carlos Misael Madrid Padilla, Shitao Fan, Lizhen Lin

TL;DR

This work develops the VM-Posterior, a robust and scalable variational Bayes framework that partitions data into groups, computes subset variational posteriors with likelihood power adjustment, and robustly aggregates them using Wasserstein-based medians. It proves contraction at rate $\epsilon_l$ with a variational gap of order $l\epsilon_l^2$, and establishes Bernstein–von Mises type normality for the aggregated posterior under both mean-field and Gaussian variational families. Theoretical guarantees are complemented by practical algorithms for variational approximation, Wasserstein medians, and likelihood-power tuning, with extensive numerical studies on Gaussian models, Gaussian mixtures, LDA, and a real penguins dataset demonstrating robustness and scalability. Together, these results yield a principled, scalable tool for robust Bayesian inference in large-scale, contaminated datasets, with broad applicability to mixture models and topic models.

Abstract

We propose a robust and scalable framework for variational Bayes (VB) that effectively handles outliers and contamination of arbitrary nature in large datasets. Our approach divides the dataset into disjoint subsets, computes the posterior for each subset, and applies VB approximation independently to these posteriors. The resulting subset variational posteriors are then aggregated using the geometric median of probability measures, computed with respect to the Wasserstein distance. This novel aggregation method yields the Variational Median Posterior (VM-Posterior) distribution. We rigorously demonstrate that the VM-Posterior preserves contraction properties akin to those of the true posterior, while accounting for the approximation error, or variational gap, inherent in VB methods. We also provide a provable robustness guarantee for the VM-Posterior. Furthermore, we establish a variational Bernstein-von Mises theorem for both multivariate Gaussian distributions with general covariance structures and the mean-field variational family. To facilitate practical implementation, we adapt existing algorithms for computing the VM-Posterior and evaluate its performance through extensive numerical experiments. The results highlight its robustness and scalability, making it a reliable tool for Bayesian inference in the presence of complex, contaminated datasets.
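
The partition, approximate, and aggregate pipeline described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm. It assumes a one-dimensional Gaussian mean model with known variance, a conjugate Gaussian prior (so the Gaussian variational family recovers each subset posterior exactly), a likelihood raised to the power $m$ (the number of groups), and the 2-Wasserstein distance. Under these assumptions the distance between two univariate Gaussians is the Euclidean distance between their (mean, standard deviation) pairs, so the Wasserstein median can be computed as a Euclidean geometric median with Weiszfeld iterations. All function names and parameter values are hypothetical.

```python
import numpy as np

def subset_gaussian_posterior(x, power, sigma=1.0, prior_sd=10.0):
    """Posterior N(mean, sd^2) of a Gaussian mean with known sigma and a
    N(0, prior_sd^2) prior, with the likelihood raised to `power`.
    (Assumption: conjugate model, so the Gaussian VB fit is exact.)"""
    precision = 1.0 / prior_sd**2 + power * len(x) / sigma**2
    mean = (power * x.sum() / sigma**2) / precision
    return mean, np.sqrt(1.0 / precision)

def weiszfeld(points, n_iter=200, tol=1e-10):
    """Euclidean geometric median of the rows of `points`."""
    z = points.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(points - z, axis=1), 1e-12)
        weights = 1.0 / dist  # closer points get larger weight
        z_new = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def vm_posterior_gaussian(x, m, seed=0):
    """Split x into m disjoint groups, fit each subset posterior, and
    return the (mean, sd) of their median under the assumed W2 metric."""
    rng = np.random.default_rng(seed)
    groups = np.array_split(rng.permutation(x), m)
    params = np.array([subset_gaussian_posterior(g, power=m) for g in groups])
    return weiszfeld(params)

# Toy data: true mean 2.0, with five gross outliers injected.
rng = np.random.default_rng(1)
data = rng.normal(2.0, 1.0, size=1000)
data[:5] = 1e4

vm_mean, vm_sd = vm_posterior_gaussian(data, m=20)
full_mean, full_sd = subset_gaussian_posterior(data, power=1)
print(f"aggregated posterior mean: {vm_mean:.3f}")
print(f"full-data posterior mean:  {full_mean:.3f}")
```

In this toy setting the five outliers can contaminate at most five of the twenty groups, so the median stays near the clean subset posteriors while the full-data posterior mean is pulled far from 2.0. Beyond one-dimensional Gaussians the median generally has no such closed-form reduction and has to be computed numerically.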

Paper Structure

This paper contains 29 sections, 12 theorems, 127 equations, 6 figures, 3 algorithms.

Key Result

Proposition 1

(Variational approximation gap). Let $l > 0$. Consider groups $G_{1}, \ldots, G_{m}$ with $l = \vert G_{j} \vert$ that form a partition of the full data $X_1, \ldots, X_n$. For any $j\in\{1,\ldots,m\}$, we have that [...]. Further, suppose that Assumption Prior-cond holds. Then [...] holds for any $j\in\{1,\ldots,m\}$.

Figures (6)

  • Figure 2: Posterior coverage for different levels of significance
  • Figure 3: Posterior coverage computational cost
  • Figure 4: Variational inference for a Gaussian mixture with increasing outlier magnitude
  • Figure 5: Posterior predictive coverage for different levels of significance
  • Figure 6: Variational inference for LDA model
  • ...and 1 more figure

Theorems & Definitions (13)

  • Proposition 1
  • Theorem 2
  • Theorem 3
  • Definition 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Corollary 8
  • Theorem 9
  • Theorem 10
  • ...and 3 more