Diffusion Bridge Variational Inference for Deep Gaussian Processes

Jian Xu, Qibin Zhao, John Paisley, Delu Zeng

TL;DR

Deep Gaussian processes offer powerful hierarchical modeling but suffer from difficult posterior inference over inducing variables. The authors propose Diffusion Bridge Variational Inference (DBVI), which replaces DDVI's unconditional start with a learnable, data-conditioned initial distribution and frames the posterior as a Doob-bridged diffusion, enabling shorter, more efficient reverse-time trajectories. They implement a structured amortization using inducing inputs $\mathbf{Z}^{(l)}$ and derive a tractable training objective that combines a bridge-based KL term with a conditional score $s_{\mathrm{cond}} = s - h$ in the reverse SDE. Across regression, image classification, large-scale classification, and unsupervised reconstruction, DBVI consistently outperforms DDVI and other variational baselines in predictive accuracy, convergence speed, and posterior quality, demonstrating scalable, data-efficient Bayesian inference for deep Gaussian processes.

Abstract

Deep Gaussian processes (DGPs) enable expressive hierarchical Bayesian modeling but pose substantial challenges for posterior inference, especially over inducing variables. Denoising diffusion variational inference (DDVI) addresses this by modeling the posterior as a time-reversed diffusion from a simple Gaussian prior. However, DDVI's fixed unconditional starting distribution remains far from the complex true posterior, resulting in inefficient inference trajectories and slow convergence. In this work, we propose Diffusion Bridge Variational Inference (DBVI), a principled extension of DDVI that initiates the reverse diffusion from a learnable, data-dependent initial distribution. This initialization is parameterized via an amortized neural network and progressively adapted using gradients from the ELBO objective, reducing the posterior gap and improving sample efficiency. To enable scalable amortization, we design the network to operate on the inducing inputs, which serve as structured, low-dimensional summaries of the dataset and naturally align with the inducing variables' shape. DBVI retains the mathematical elegance of DDVI, including Girsanov-based ELBOs and reverse-time SDEs, while reinterpreting the prior via a Doob-bridged diffusion process. We derive a tractable training objective under this formulation and implement DBVI for scalable inference in large-scale DGPs. Across regression, classification, and image reconstruction tasks, DBVI consistently outperforms DDVI and other variational baselines in predictive accuracy, convergence speed, and posterior quality.
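To make the idea of a data-dependent starting point concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a variance-preserving forward SDE $d\mathbf{U} = -\tfrac{1}{2}\beta\mathbf{U}\,dt + \sqrt{\beta}\,d\mathbf{W}$, a hypothetical one-layer amortizer `amortized_init_mean` acting on the inducing inputs, and a toy closed-form score in place of a learned score network. It omits the bridge correction and the ELBO training loop; it only shows where an initial mean computed from the inducing inputs enters a reverse-time Euler-Maruyama sampler.

```python
import numpy as np

def amortized_init_mean(Z, W, b):
    # Hypothetical one-layer "amortizer": maps inducing inputs Z (M, D_in)
    # to an initial mean of shape (M, D_out), matching the inducing variables.
    return np.tanh(Z @ W + b)

def toy_score(U, t, target_mean, beta):
    # Exact score of the VP-SDE marginal when the target is N(target_mean, I):
    # the marginal at time t is N(target_mean * exp(-0.5 * beta * t), I).
    return -(U - target_mean * np.exp(-0.5 * beta * t))

def reverse_sample(U_start, score_fn, beta=1.0, n_steps=50, seed=0):
    # Euler-Maruyama on the reverse-time SDE, integrated from t = 1 down to t = 0.
    # Assumed forward SDE: dU = -0.5 * beta * U dt + sqrt(beta) dW.
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    U = np.array(U_start, dtype=float)
    for k in range(n_steps):
        t = 1.0 - k * dt
        drift = -0.5 * beta * U - beta * score_fn(U, t)   # f - g^2 * score
        noise = rng.standard_normal(U.shape)
        U = U - drift * dt + np.sqrt(beta * dt) * noise   # one step backwards in time
    return U

rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 3))           # 5 inducing inputs in 3 dimensions
W = 0.1 * rng.standard_normal((3, 2))     # amortizer weights (made up for the demo)
b = np.zeros(2)
target = np.ones((5, 2))                  # toy posterior mean (made up for the demo)

mu0 = amortized_init_mean(Z, W, b)        # data-dependent starting mean
U_start = mu0 + rng.standard_normal(mu0.shape)
U0 = reverse_sample(U_start, lambda U, t: toy_score(U, t, target, 1.0))
print(U0.shape)
```

In this sketch, replacing `mu0` with zeros would correspond to an unconditional Gaussian start of the kind the abstract attributes to DDVI. In the method described above, the amortizer parameters would instead be trained through the ELBO.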

Paper Structure

This paper contains 31 sections, 7 theorems, 90 equations, 4 figures, 3 tables.

Key Result

Proposition 1

Let the initial constraint be encoded by the Doob $h$-transform with [display equation missing from source]. Then the forward bridge has drift [display equation missing from source] with the same diffusion coefficient $g(t)$. Moreover, the reverse-time bridge SDE is [display equation missing from source]. Equivalently, the conditional score equals $s_{\text{cond}}(\mathbf{U}_t,t,\mathbf{U}_0)=s(\mathbf{U}_t,t,\mathbf{U}_0)-h(\mathbf{U}_t,t,\mathbf{U}_0)$.
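For readers unfamiliar with Doob's $h$-transform, the sketch below shows the standard textbook case rather than the construction in Proposition 1: a standard Brownian motion ($g = 1$, zero drift) conditioned to reach a value `y` at time `T`. The transform adds the drift $\nabla_x \log p(X_T = y \mid X_t = x) = (y - x)/(T - t)$, which gives a Brownian bridge. The function names `h_drift` and `simulate_bridge` are illustrative only. The proposition conditions on an initial constraint and uses its own sign convention ($s_{\text{cond}} = s - h$), so this is an analogy for the mechanism, not a reproduction of the result.

```python
import numpy as np

def h_drift(x, t, y, T):
    # Gradient in x of log N(y; x, T - t), the Brownian transition density.
    return (y - x) / (T - t)

def simulate_bridge(x0, y, T=1.0, n_steps=1000, seed=0):
    # Euler-Maruyama for dX = h_drift(X, t) dt + dW on [0, T].
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = float(x0)
    path = [x]
    for k in range(n_steps):
        t = k * dt                          # t stays below T, so T - t > 0
        x = x + h_drift(x, t, y, T) * dt + np.sqrt(dt) * rng.standard_normal()
        path.append(x)
    return np.array(path)

path = simulate_bridge(x0=0.0, y=2.0)
print(len(path), abs(path[-1] - 2.0))       # the endpoint lands close to y
```

The added drift grows as $t \to T$, which is what forces every sample path to the prescribed endpoint. In the discretized version the final value differs from `y` only by the last step's noise, of order $\sqrt{dt}$.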

Figures (4)

  • Figure 1: Comparison between DDVI and DBVI. (Left) DDVI starts from an unconditional Gaussian prior and runs a reverse diffusion SDE towards the posterior. (Right) DBVI starts from an input-conditioned initial distribution and uses an observation-conditioned diffusion bridge SDE, leading to shorter and more efficient inference trajectories.
  • Figure 2: DBVI amortized initialization using inducing inputs $\mathbf{Z}^{(l)}$.
  • Figure 3: Test RMSE and test mean NLL (with one standard deviation error bars) of deep Gaussian processes with different inference methods (DDVI, IPVI, SGHMC, DSVI, and our proposed DBVI) across 10 benchmark datasets (Boston, Energy, Power, Concrete, Airline, Yacht, Qsar, Protein, Kin8nm, and Year). Each marker corresponds to a DGP with 2 to 5 layers. DBVI (orange) consistently achieves lower RMSE and NLL than existing baselines, demonstrating improved posterior approximation and predictive performance.
  • Figure 4: Comparison of DDVI and DBVI on the Energy dataset (Test RMSE).

Theorems & Definitions (11)

  • Proposition 1: Forward & reverse SDE under Doob's $h$-transform
  • Proposition 2: Marginal of Doob-augmented bridge process
  • Proposition 3: DBVI loss with amortized mean
  • Proposition 4: Forward & reverse SDE under Doob's $h$-transform
  • proof
  • Proposition 5: Marginal of Doob-augmented bridge process
  • proof
  • Proposition 6: DBVI loss with amortized mean
  • proof
  • Lemma 1: Reverse-time Girsanov: path term as score matching
  • ...and 1 more