Generative Bayesian Inference with GANs

Yuexi Wang, Veronika Ročková

TL;DR

The paper tackles likelihood-free Bayesian inference for simulator-based models by combining ABC with generative adversarial networks to learn implicit samplers of the posterior π(θ|X). The authors propose B-GAN, a conditional Wasserstein GAN trained on the ABC reference table to approximate the joint π(X,θ); once trained, it draws iid posterior samples from π(θ|X0) at minimal extra cost. They augment B-GAN with two refinements, a two-step sequential proposal strategy and an adversarial variational Bayes objective, which concentrate the approximation around the observed data and tighten it. Theoretical results establish finite-sample bounds on the total variation distance between the true and approximate posteriors, and empirical studies on Lotka-Volterra (LV), Boom-and-Bust, and SIR/common cold data show competitive or superior performance relative to state-of-the-art likelihood-free methods, with favorable scaling and flexibility.

Abstract

In the absence of explicit or tractable likelihoods, Bayesians often resort to approximate Bayesian computation (ABC) for inference. Our work bridges ABC with deep neural implicit samplers based on generative adversarial networks (GANs) and adversarial variational Bayes. Both ABC and GANs compare aspects of observed and fake data to simulate from posteriors and likelihoods, respectively. We develop a Bayesian GAN (B-GAN) sampler that directly targets the posterior by solving an adversarial optimization problem. B-GAN is driven by a deterministic mapping learned on the ABC reference by conditional GANs. Once the mapping has been trained, iid posterior samples are obtained by filtering noise at a negligible additional cost. We propose two post-processing local refinements using (1) data-driven proposals with importance reweighting, and (2) variational Bayes. We support our findings with frequentist-Bayesian results, showing that the typical total variation distance between the true and approximate posteriors converges to zero for certain neural network generators and discriminators. Our findings on simulated data show highly competitive performance relative to some of the most recent likelihood-free posterior simulators.
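
The two ingredients named in the abstract, a reference table of simulated triples and a deterministic mapping that turns noise into posterior draws, can be illustrated on a toy problem. The sketch below is not the authors' implementation and trains no GAN. It assumes a conjugate Gaussian model (prior N(0, 1), data summarized by the mean of n draws from N(θ, 1)) so that the ideal mapping is known in closed form. The names `simulate`, `build_reference_table`, `oracle_generator`, and `sample_posterior` are hypothetical.

```python
import math
import random


def simulate(theta, n, rng):
    """Toy simulator (an assumption): sample mean of n draws from N(theta, 1)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def build_reference_table(T, n, rng):
    """Draw T triples (theta_j, X_j, Z_j): prior draw, simulated data, noise."""
    table = []
    for _ in range(T):
        theta = rng.gauss(0.0, 1.0)   # prior N(0, 1), chosen for illustration
        x = simulate(theta, n, rng)   # fake data generated at theta
        z = rng.gauss(0.0, 1.0)       # noise that a generator would consume
        table.append((theta, x, z))
    return table


def oracle_generator(z, x, n):
    """Stand-in for a trained conditional generator g(Z, X).

    For this conjugate model the exact posterior is
    N(n * x / (n + 1), 1 / (n + 1)), so the ideal map is available directly.
    A learned network would only approximate it.
    """
    return n * x / (n + 1) + z / math.sqrt(n + 1)


def sample_posterior(x0, n, m, rng):
    """Push m fresh noise draws through the mapping, conditioning on x0."""
    return [oracle_generator(rng.gauss(0.0, 1.0), x0, n) for _ in range(m)]


rng = random.Random(0)
n = 10
table = build_reference_table(1000, n, rng)

x0 = 0.8                                    # observed summary (made up)
draws = sample_posterior(x0, n, 5000, rng)
post_mean = sum(draws) / len(draws)
post_var = sum((d - post_mean) ** 2 for d in draws) / (len(draws) - 1)
print(post_mean, post_var)
```

In this toy setting the table is never used for training, since the mapping is known. It is built only to show the shape of the data a conditional GAN would be fitted to. Sampling after that costs one function evaluation per draw, which is the sense in which posterior samples are cheap once the mapping is available.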

Paper Structure (34 sections, 7 theorems, 86 equations, 14 figures, 5 tables, 4 algorithms)


Key Result

Theorem 1

Let $\widehat{\bm{\beta}}_T$ be as in (eq:bhat), where $\mathcal{F}=\{f: \|f\|_\infty\leq B\}$ for some $B>0$. Denote by $\mathbb{E}$ the expectation with respect to the empirical measure on $\{(\theta_j, X_j)\}_{j=1}^T$ and $\{Z_j\}_{j=1}^T$ in the reference table. Assume that the prior satisfies [...], where the KL neighborhood $B_n(\theta_0;\epsilon)$ is defined as [...]. Then for $T\geq \mathrm{Pdim}(\mathcal{F})$ [...]

Figures (14)

  • Figure 1: The approximated posteriors given by B-GAN, SNL, SS, and W2 for the toy example. The results for $\theta_4$ are similar to those for $\theta_3$ and thus not shown here.
  • Figure 2: Posterior densities under the Gaussian model. The true parameter is ${\bm \theta}_0=(-0.7, -2.9, -1.0, -0.9, 0.6)'$, while the signs of $\theta_3$ and $\theta_4$ are not identifiable.
  • Figure 3: Maximum Mean Discrepancies (MMD, log scale) between the true posteriors and the approximated posteriors. The box-plots are computed from 10 repetitions.
  • Figure 4: Posterior for the M/G/1 queuing model under different implementations of B-GAN.
  • Figure 5: Approximate posterior densities under the Lotka-Volterra Model. The true parameter vector (marked by vertical lines) is ${\bm \theta}_0=(0.01, 0.5, 1, 0.01)'$.
  • ...and 9 more figures
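
Figure 3 compares methods by the Maximum Mean Discrepancy between true and approximated posteriors. As a rough illustration of how such a quantity can be estimated from two sets of draws, the sketch below computes the biased (V-statistic) estimate of the squared MMD, $\mathbb{E}k(x,x') + \mathbb{E}k(y,y') - 2\,\mathbb{E}k(x,y)$. The Gaussian kernel, its bandwidth, and the function names are assumptions made for this example; the kernel used in the paper is not stated in this summary.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


rng = random.Random(1)
# Two samples from the same distribution and one from a shifted distribution.
same_a = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
same_b = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
shifted = [(rng.gauss(2.0, 1.0),) for _ in range(200)]

mmd_same = mmd_squared(same_a, same_b)
mmd_shift = mmd_squared(same_a, shifted)
print(mmd_same, mmd_shift)
```

The estimate is close to zero when both samples come from the same distribution and grows as they separate. Taking its logarithm gives a log-scale summary of the kind the caption describes.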

Theorems & Definitions (16)

  • Definition 1
  • Definition 2
  • Example 1: Toy Example
  • Example 2: Toy Example Continued
  • Theorem 1
  • Remark 3
  • Corollary 4
  • Definition 5
  • Corollary 6
  • Remark 7
  • ...and 6 more