Conditional diffusions for amortized neural posterior estimation
Tianyu Chen, Vansh Bansal, James G. Scott
TL;DR
This work addresses amortized Bayesian posterior estimation when the likelihood is intractable by introducing conditional diffusion decoders conditioned on learned data summaries. The authors prove a KL-divergence upper bound for jointly trained diffusion decoders and summary networks, and show in a comprehensive benchmark that diffusion-based decoders are more stable and more accurate than normalizing flows, and train faster, across diverse problem classes and encoder architectures. They also provide three illustrative examples highlighting the ability of diffusion models to recover complex, multimodal posteriors and boundary transitions. The results suggest that conditional diffusion with learned summaries offers a robust, scalable alternative for simulation-based inference (SBI), with practical implications for complex scientific applications where likelihoods are costly or unavailable.
Abstract
Neural posterior estimation (NPE), a simulation-based computational approach for Bayesian inference, has shown great success in approximating complex posterior distributions. Existing NPE methods typically rely on normalizing flows, which approximate a distribution by composing many simple, invertible transformations. Flow-based models, while state of the art for NPE, are known to suffer from several limitations, including training instability and sharp trade-offs between representational power and computational cost. In this work, we demonstrate the effectiveness of conditional diffusions coupled with high-capacity summary networks for amortized NPE. Conditional diffusions address many of the challenges faced by flow-based methods. Our results show that, across a highly varied suite of benchmarking problems for NPE architectures, diffusions offer improved stability, superior accuracy, and faster training times, even with simpler, shallower models. Building on prior work on diffusions for NPE, we show that these gains persist across a variety of summary network architectures. Code is available at https://github.com/TianyuCodings/cDiff.
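
To make the pairing of a summary network with a conditional diffusion decoder concrete, the sketch below estimates a standard denoising (noise-prediction) loss on simulated (theta, x) pairs. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that): the Gaussian toy simulator, the cosine noise schedule, the hand-written `summary` statistics, and the linear `predict_noise` function are all hypothetical stand-ins for the learned networks and training setup described in the paper.

```python
import math
import random


def summary(x):
    """Toy stand-in for a learned summary network: (mean, variance) of the data."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    return (mean, var)


def noise_schedule(t):
    """Assumed variance-preserving schedule with alpha^2 + sigma^2 = 1 on [0, 1]."""
    alpha = math.cos(0.5 * math.pi * t)
    sigma = math.sin(0.5 * math.pi * t)
    return alpha, sigma


def predict_noise(theta_t, t, s, weights):
    """Hypothetical linear noise predictor; a real decoder would be a neural network."""
    w_theta, w_t, w_mean, w_var, bias = weights
    return w_theta * theta_t + w_t * t + w_mean * s[0] + w_var * s[1] + bias


def simulate(n_pairs, n_obs, rng):
    """Toy simulator: theta ~ N(0, 1), then n_obs observations x_i ~ N(theta, 1)."""
    pairs = []
    for _ in range(n_pairs):
        theta = rng.gauss(0.0, 1.0)
        x = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        pairs.append((theta, x))
    return pairs


def denoising_loss(pairs, weights, rng):
    """Monte Carlo estimate of E[(eps - eps_hat(theta_t, t, s(x)))^2]."""
    total = 0.0
    for theta, x in pairs:
        t = rng.random()                       # random diffusion time in [0, 1)
        eps = rng.gauss(0.0, 1.0)              # noise the decoder must recover
        alpha, sigma = noise_schedule(t)
        theta_t = alpha * theta + sigma * eps  # noised parameter
        eps_hat = predict_noise(theta_t, t, summary(x), weights)
        total += (eps - eps_hat) ** 2
    return total / len(pairs)


rng = random.Random(0)
pairs = simulate(n_pairs=200, n_obs=10, rng=rng)
# All-zero weights give an untrained baseline that always predicts zero noise.
loss = denoising_loss(pairs, weights=(0.0, 0.0, 0.0, 0.0, 0.0), rng=rng)
print(f"baseline denoising loss: {loss:.3f}")
```

In an amortized setup, the decoder and the summary network would be trained jointly by minimizing this loss over fresh simulations; sampling the posterior for a new dataset then only requires running the reverse diffusion conditioned on that dataset's summary, with no retraining.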
