Posterior Mean Matching: Generative Modeling through Online Bayesian Inference

Sebastian Salazar, Michal Kucer, Yixin Wang, Emily Casleton, David Blei

TL;DR

Posterior Mean Matching (PMM) reframes generative modeling as online Bayesian inference, constructing augmented target and Bayesian models with conjugate priors to enable posterior-mean updates that progressively refine samples from the target $p^{*}(\mathbf{x})$. By instantiating PMM with conjugate pairs such as Normal–Normal, Dirichlet–Categorical, and Gamma–Poisson, the framework yields closed-form online updates for the posterior mean $\boldsymbol{\mu}_t$ and corresponding PMM objectives, while establishing connections to diffusion via continuous-time SDEs. The approach is demonstrated on image and text data, achieving competitive FID scores against diffusion methods and strong language-model metrics (e.g., BPC on text8), thereby offering a flexible alternative to diffusion-based generative modeling. The work further links PMM updates to SDEs and jump processes, enabling classical numerical methods (e.g., Euler–Maruyama) to be repurposed for PMM sampling and suggesting avenues for broader conjugacy and modality coverage with online Bayesian updates.

Abstract

This paper introduces posterior mean matching (PMM), a new method for generative modeling that is grounded in Bayesian inference. PMM uses conjugate pairs of distributions to model complex data of various modalities like images and text, offering a flexible alternative to existing methods like diffusion models. PMM models iteratively refine noisy approximations of the target distribution using updates from online Bayesian inference. PMM is flexible because its mechanics are based on general Bayesian models. We demonstrate this flexibility by developing specialized examples: a generative PMM model of real-valued data using the Normal-Normal model, a generative PMM model of count data using a Gamma-Poisson model, and a generative PMM model of discrete data using a Dirichlet-Categorical model. For the Normal-Normal PMM model, we establish a direct connection to diffusion models by showing that its continuous-time formulation converges to a stochastic differential equation (SDE). Additionally, for the Gamma-Poisson PMM, we derive a novel SDE driven by a Cox process, which is a significant departure from traditional Brownian motion-based generative models. PMMs achieve performance that is competitive with generative models for language modeling and image generation.
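As a rough illustration of the online Bayesian updates mentioned above, the sketch below runs a textbook conjugate Normal-Normal update. It is not the paper's exact construction. It assumes a prior $\pmb{x} \sim \mathcal{N}(\mu_0, \lambda_0^{-1})$ and noisy observations $\pmb{y}_t \sim \mathcal{N}(\pmb{x}, \alpha_t^{-1})$ with known precisions $\alpha_t$. The function name, the prior settings, and the observation model are all hypothetical choices made for this example.

```python
import numpy as np

def normal_posterior_mean_path(x, precisions, prior_mean=0.0,
                               prior_precision=1.0, seed=0):
    """Online conjugate Normal-Normal updates of the posterior mean.

    Assumed model (illustrative, not taken from the paper):
        x   ~ N(prior_mean, 1 / prior_precision)
        y_t ~ N(x, 1 / alpha_t),  alpha_t = precisions[t]
    Returns the trajectory of posterior means and the final precision.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mu = np.full_like(x, prior_mean)   # current posterior mean
    lam = float(prior_precision)       # current posterior precision
    path = [mu.copy()]
    for alpha in precisions:
        # Simulate a noisy observation of x with precision alpha.
        y = x + rng.standard_normal(x.shape) / np.sqrt(alpha)
        # Precision-weighted average of the old mean and the new observation.
        new_lam = lam + alpha
        mu = (lam * mu + alpha * y) / new_lam
        lam = new_lam
        path.append(mu.copy())
    return np.stack(path), lam

# Usage: with increasing precisions the posterior mean approaches x.
target = np.array([1.0, -2.0])
path, final_precision = normal_posterior_mean_path(target, np.arange(1, 201))
print(path[0], path[-1], final_precision)
```

Each step costs a constant amount of work because only the running mean and precision are stored. In a generative setting the target $\pmb{x}$ is unknown at sampling time, so this sketch only shows the inference side of the procedure.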

Paper Structure

This paper contains 60 sections, 10 theorems, 86 equations, 7 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

(Concentration of posterior mean) Let $\{ \pmb{y}_1, \ldots, \pmb{y}_t \}$ be observations generated according to equation (eqn:norm-norm-pmm-marginal). Suppose $\alpha_t$ is a known, positive, increasing sequence satisfying $\lim_{t \to \infty} \alpha_t = \infty$. Then, the posterior mean $\pmb{\mu}_t$ with respect to the joint distribution of $(\pmb{x}, \pmb{y}_1, \pmb{y}_2, \ldots)$ in equation (eq

Figures (7)

  • Figure 1: Diagram of the online Bayesian inference update process. At each time step $t$, an observation $\pmb{y}_t$ is incorporated to update the posterior mean $\pmb{\mu}_t$. The ellipsis ($\cdots$) indicates the iterative nature of the updates, starting from the prior mean $\pmb{\mu}_{0}$.
  • Figure 2: Convergence of the posterior mean trajectories $\pmb{\mu}_t$ to samples from the target $\pmb{x} \sim p^*(\pmb{x})$ as $t$ increases for the Normal Posterior Mean Matching (PMM) model. Refer to the Appendix for a more detailed view.
  • Figure 3: Comparison of sample generations for the Normal-Normal PMM model across CIFAR10 and FFHQ datasets. See the Appendix for a larger sample of generated images.
  • Figure C.1: Convergence of the posterior mean $\pmb{\mu}_t$ to target samples $\pmb{x} \sim p^*(\pmb{x})$ as $t$ increases for the Normal-Normal Posterior Mean Matching (PMM) model.
  • ...and 2 more figures

Theorems & Definitions (17)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem A.1
  • proof
  • Theorem A.2
  • proof
  • Theorem A.3
  • proof
  • Theorem A.4
  • ...and 7 more