Diffusion Bridge Variational Inference for Deep Gaussian Processes
Jian Xu, Qibin Zhao, John Paisley, Delu Zeng
TL;DR
Deep Gaussian processes offer powerful hierarchical modeling but suffer from difficult posterior inference over inducing variables. The authors propose Diffusion Bridge Variational Inference (DBVI), which replaces DDVI's unconditional start with a learnable, data-conditioned initial distribution and frames the posterior as a Doob-bridged diffusion, enabling shorter, more efficient reverse-time trajectories. They implement a structured amortization using inducing inputs $\mathbf{Z}^{(l)}$ and derive a tractable training objective that combines a bridge-based KL term with a conditional score $s_{\mathrm{cond}} = s - h$ in the reverse SDE. Across regression, image classification, large-scale classification, and unsupervised reconstruction, DBVI consistently outperforms DDVI and other variational baselines in predictive accuracy, convergence speed, and posterior quality, demonstrating scalable, data-efficient Bayesian inference for deep Gaussian processes.
Abstract
Deep Gaussian processes (DGPs) enable expressive hierarchical Bayesian modeling but pose substantial challenges for posterior inference, especially over inducing variables. Denoising diffusion variational inference (DDVI) addresses this by modeling the posterior as a time-reversed diffusion from a simple Gaussian prior. However, DDVI's fixed unconditional starting distribution remains far from the complex true posterior, resulting in inefficient inference trajectories and slow convergence. In this work, we propose Diffusion Bridge Variational Inference (DBVI), a principled extension of DDVI that initiates the reverse diffusion from a learnable, data-dependent initial distribution. This initialization is parameterized via an amortized neural network and progressively adapted using gradients from the ELBO objective, reducing the posterior gap and improving sample efficiency. To enable scalable amortization, we design the network to operate on the inducing inputs, which serve as structured, low-dimensional summaries of the dataset and naturally align with the inducing variables' shape. DBVI retains the mathematical elegance of DDVI, including Girsanov-based ELBOs and reverse-time SDEs, while reinterpreting the prior via a Doob-bridged diffusion process. We derive a tractable training objective under this formulation and implement DBVI for scalable inference in large-scale DGPs. Across regression, classification, and image reconstruction tasks, DBVI consistently outperforms DDVI and other variational baselines in predictive accuracy, convergence speed, and posterior quality.
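As a rough illustration of the sampling idea described above, the following NumPy sketch draws inducing variables by starting a reverse-time diffusion from an amortized, input-dependent Gaussian rather than a fixed standard Gaussian. It is not the authors' implementation. Everything named here is an assumption made for illustration: the linear amortizer (`amortized_init`, `W_mu`, `W_log_sigma`), the variance-preserving forward SDE with constant rate `beta`, the Euler-Maruyama discretization, and the placeholder functions `toy_score` and `toy_h` that stand in for the learned score $s$ and the bridge correction $h$.

```python
import numpy as np


def amortized_init(Z, W_mu, W_log_sigma):
    """Hypothetical amortizer: map inducing inputs Z (M x D) to the mean and
    standard deviation of an initial Gaussian over inducing variables (M x P).
    A linear map is assumed here purely for illustration."""
    mu = Z @ W_mu
    sigma = np.exp(Z @ W_log_sigma)  # exponentiate so the std is positive
    return mu, sigma


def toy_score(u, t):
    """Placeholder for a learned score s(u, t); here the score of N(0, I)."""
    return -u


def toy_h(u, t):
    """Placeholder for the bridge correction h(u, t); zero in this sketch."""
    return np.zeros_like(u)


def cond_score(u, t):
    """Conditional score s_cond = s - h, as stated in the TL;DR."""
    return toy_score(u, t) - toy_h(u, t)


def reverse_sde_sample(Z, W_mu, W_log_sigma, n_steps=20, T=1.0, beta=1.0, rng=None):
    """Euler-Maruyama integration of a reverse-time SDE, started from the
    amortized initial distribution.

    Assumed forward SDE: du = -0.5 * beta * u dt + sqrt(beta) dW.
    Stepping from t to t - dt, its reverse-time counterpart gives
    u <- u + (0.5 * beta * u + beta * score) * dt + sqrt(beta * dt) * noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    mu, sigma = amortized_init(Z, W_mu, W_log_sigma)
    u = mu + sigma * rng.standard_normal(mu.shape)  # data-dependent start
    dt = T / n_steps
    for k in range(n_steps):
        t = T - k * dt
        drift = 0.5 * beta * u + beta * cond_score(u, t)
        u = u + drift * dt + np.sqrt(beta * dt) * rng.standard_normal(u.shape)
    return u


# Toy usage: 5 inducing points in 3 input dimensions, 2 output dimensions.
rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 3))
W_mu = 0.1 * rng.standard_normal((3, 2))
W_log_sigma = 0.1 * rng.standard_normal((3, 2))
u_sample = reverse_sde_sample(Z, W_mu, W_log_sigma, rng=rng)
```

In an actual variational scheme the amortizer weights and the score network would be trained jointly by maximizing the ELBO; this sketch covers only the sampling path and omits the objective entirely.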
