Posterior Mean Matching: Generative Modeling through Online Bayesian Inference
Sebastian Salazar, Michal Kucer, Yixin Wang, Emily Casleton, David Blei
TL;DR
Posterior Mean Matching (PMM) reframes generative modeling as online Bayesian inference. It constructs an augmented target model and a Bayesian model with conjugate priors, so that posterior-mean updates progressively refine samples from the target $p^{*}(\mathbf{x})$. Instantiating PMM with the Normal–Normal, Dirichlet–Categorical, and Gamma–Poisson conjugate pairs yields closed-form online updates for the posterior mean $\boldsymbol{\mu}_t$ and corresponding PMM objectives. In continuous time, the PMM updates are linked to SDEs and jump processes, which establishes a connection to diffusion models and allows classical numerical methods (e.g., Euler–Maruyama) to be repurposed for PMM sampling. On image and text data, PMM achieves FID scores competitive with diffusion methods and strong language-modeling metrics (e.g., BPC on text8), offering a flexible alternative to diffusion-based generative modeling and suggesting extensions to other conjugate pairs and data modalities.
Abstract
This paper introduces posterior mean matching (PMM), a new method for generative modeling that is grounded in Bayesian inference. PMM uses conjugate pairs of distributions to model complex data of various modalities like images and text, offering a flexible alternative to existing methods like diffusion models. PMM models iteratively refine noisy approximations of the target distribution using updates from online Bayesian inference. PMM is flexible because its mechanics are based on general Bayesian models. We demonstrate this flexibility by developing specialized examples: a generative PMM model of real-valued data using the Normal-Normal model, a generative PMM model of count data using a Gamma-Poisson model, and a generative PMM model of discrete data using a Dirichlet-Categorical model. For the Normal-Normal PMM model, we establish a direct connection to diffusion models by showing that its continuous-time formulation converges to a stochastic differential equation (SDE). Additionally, for the Gamma-Poisson PMM, we derive a novel SDE driven by a Cox process, which is a significant departure from traditional Brownian motion-based generative models. PMMs achieve performance that is competitive with existing generative models on language modeling and image generation.
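To make the phrase "updates from online Bayesian inference" concrete, the following is a minimal sketch of the textbook conjugate updates for the three pairs named above, written with NumPy. It is not the PMM training or sampling procedure from the paper: the function names, the prior parameters, the noise precision, and the fixed "true" values that the observations are drawn around are all assumptions chosen for illustration. It only shows how a posterior mean can be refined one observation at a time.

```python
import numpy as np


def normal_normal_step(mean, precision, y, noise_precision):
    """Update a N(mean, 1/precision) belief about x after seeing
    y ~ N(x, 1/noise_precision). Returns the new mean and precision."""
    new_precision = precision + noise_precision
    new_mean = (precision * mean + noise_precision * y) / new_precision
    return new_mean, new_precision


def gamma_poisson_step(shape, rate, y):
    """Update a Gamma(shape, rate) belief about a Poisson rate after one
    count y. The posterior mean is shape / rate."""
    return shape + y, rate + 1.0


def dirichlet_categorical_step(alpha, k):
    """Update a Dirichlet(alpha) belief after observing category index k.
    The posterior mean is alpha / alpha.sum()."""
    new_alpha = alpha.copy()
    new_alpha[k] += 1.0
    return new_alpha


rng = np.random.default_rng(0)
n_steps = 2000  # number of online updates (assumed)

# Normal-Normal: noisy real-valued observations of a hypothetical point x.
x = np.array([1.5, -0.5])
noise_precision = 0.25                   # assumed, i.e. noise variance 4
mean, precision = np.zeros_like(x), 1.0  # assumed N(0, I) prior
ys = []
for _ in range(n_steps):
    y = x + rng.normal(size=x.shape) / np.sqrt(noise_precision)
    ys.append(y)
    mean, precision = normal_normal_step(mean, precision, y, noise_precision)

# The recursion reproduces the closed-form batch posterior mean.
batch_mean = noise_precision * np.sum(ys, axis=0) / (1.0 + noise_precision * n_steps)

# Gamma-Poisson: counts drawn around a hypothetical rate.
true_rate = 3.0
shape, rate = 1.0, 1.0                   # assumed Gamma(1, 1) prior
total_count = 0
for _ in range(n_steps):
    c = rng.poisson(true_rate)
    total_count += c
    shape, rate = gamma_poisson_step(shape, rate, c)
gamma_mean = shape / rate

# Dirichlet-Categorical: category draws from hypothetical probabilities.
true_probs = np.array([0.2, 0.3, 0.5])
alpha = np.ones(3)                       # assumed uniform Dirichlet prior
for _ in range(n_steps):
    k = rng.choice(3, p=true_probs)
    alpha = dirichlet_categorical_step(alpha, k)
dirichlet_mean = alpha / alpha.sum()

print("Normal-Normal posterior mean:", mean)
print("Gamma-Poisson posterior mean:", gamma_mean)
print("Dirichlet-Categorical posterior mean:", dirichlet_mean)
```

In each case the state carried between steps is small (a mean and a precision, a shape and a rate, or a count vector), which is what makes the updates cheap to apply online. As observations accumulate, the posterior mean moves from the prior mean toward the quantity being observed.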
