A Unified and Fast-Sampling Diffusion Bridge Framework via Stochastic Optimal Control
Mokai Pan, Kaizhen Zhu, Yuexin Ma, Yanwei Fu, Jingyi Yu, Jingya Wang, Ye Shi
TL;DR
UniDB reframes diffusion-bridge models as stochastic optimal control with a tunable terminal penalty $\gamma$, unifying prior Doob-based approaches and enabling a principled trade-off between control cost and endpoint accuracy. It introduces UniDB-GOU as a concrete instantiation and UniDB++ to achieve fast, training-free reverse-time sampling via exact reverse-SDE solutions, a data-prediction model, and an SDE-Corrector. The framework yields state-of-the-art performance on image restoration tasks while achieving 5x–20x inference speedups, connecting theory and practice for diffusion bridges. This work broadens the applicability of diffusion bridges to real-time applications and diverse inverse problems, including potential medical-imaging use cases.
Abstract
Recent advances in diffusion bridge models leverage Doob's $h$-transform to establish fixed endpoints between distributions, demonstrating promising results in image translation and restoration tasks. However, these approaches often produce blurred or excessively smoothed image details and lack a comprehensive theoretical foundation to explain these shortcomings. To address these limitations, we propose UniDB, a unified and fast-sampling framework for diffusion bridges based on Stochastic Optimal Control (SOC). We reformulate the problem as an SOC-based optimization and prove that existing diffusion bridges employing Doob's $h$-transform constitute a special case that emerges when the terminal penalty coefficient in the SOC cost function tends to infinity. By incorporating a tunable terminal penalty coefficient, UniDB achieves an optimal balance between control costs and terminal penalties, substantially improving detail preservation and output quality. To avoid the high computational cost of iterative Euler sampling in UniDB, we design a training-free accelerated algorithm by deriving exact closed-form solutions for UniDB's reverse-time SDE. We further replace conventional noise prediction with a more stable data prediction model and add an SDE-Corrector mechanism that maintains perceptual quality in low-step regimes, effectively reducing error accumulation. Extensive experiments across diverse image restoration tasks validate the superiority and adaptability of the proposed framework, bridging the gap between theoretical generality and practical efficiency. Our code is available at https://github.com/2769433owo/UniDB-plusplus.
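The role of the terminal penalty can be illustrated on a toy problem that is not taken from the paper: a scalar controlled Brownian motion $dx_t = u_t\,dt + dw_t$ with cost $\mathbb{E}\big[\int_0^T \tfrac{1}{2}u_t^2\,dt + \tfrac{\gamma}{2}(x_T - y)^2\big]$. This is a standard linear-quadratic problem whose optimal feedback is $u_t = \gamma\,(y - x_t)/(1 + \gamma(T - t))$. As $\gamma \to \infty$ it tends to the Brownian-bridge drift $(y - x_t)/(T - t)$, which is the drift Doob's $h$-transform yields for Brownian motion conditioned on $x_T = y$. The sketch below assumes this toy setting only; the function names are hypothetical, and UniDB's actual dynamics and cost may differ.

```python
def control_gain(t, gamma, T=1.0):
    """Gain P(t) of the optimal feedback u = P(t) * (y - x) in the toy problem."""
    return gamma / (1.0 + gamma * (T - t))


def terminal_gap(x0, y, gamma, T=1.0):
    """Closed-form gap y - E[x_T] under the optimal control."""
    return (y - x0) / (1.0 + gamma * T)


def simulate_mean(x0, y, gamma, T=1.0, steps=10_000):
    """Euler integration of the mean ODE dm/dt = P(t) * (y - m).

    The control is linear in x and the noise has zero mean,
    so E[x_t] follows this deterministic ODE.
    """
    dt = T / steps
    m = x0
    for k in range(steps):
        m += control_gain(k * dt, gamma, T) * (y - m) * dt
    return m


if __name__ == "__main__":
    x0, y = 0.0, 1.0
    for gamma in (1.0, 10.0, 1e3, 1e6):
        gap = terminal_gap(x0, y, gamma)
        mean_T = simulate_mean(x0, y, gamma)
        print(f"gamma={gamma:g}  E[x_T]={mean_T:.6f}  gap={gap:.6f}")
```

In this toy setting a finite $\gamma$ leaves a residual gap of $(y - x_0)/(1 + \gamma T)$ between the expected endpoint and the target in exchange for a smaller control cost, while a very large $\gamma$ pins the endpoint as a bridge would.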
