Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach
Zhengyi Guo, Wenpin Tang, Renyuan Xu
TL;DR
This work presents a principled approach to hard-constraint diffusion generation. It casts conditioning as a Doob's $h$-transform and derives guided dynamics that add a drift term $\overline{g}(t)^2 \nabla \log h(t,Y_t)$ to the pretrained sampler while leaving the pretrained score network unchanged. It introduces two off-policy learning algorithms, CDG-ML and CDG-MCL, which estimate $h$ and $\nabla h$ from trajectories of the pretrained model, and it provides non-asymptotic total-variation and Wasserstein guarantees for the conditional sampler in terms of the base model error and the guidance estimation error. The authors establish convergence results for learning $h$ and its gradient, and demonstrate through synthetic and real-stress experiments that the framework can enforce hard constraints and generate rare-event samples efficiently, with extensions to ODE-based sampling and reinforced conditioning. The framework targets safety-critical and stress-testing scenarios, where constrained generation needs theoretical guarantees.
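For orientation, the guided dynamics mentioned above can be written out in standard Doob $h$-transform form. The symbols $b$ (drift of the pretrained sampler), $S$ (constraint set) and $T$ (terminal time) are notation assumed here for illustration and may differ from the paper's.

$$
\begin{aligned}
\text{pretrained:}\quad & dY_t = b(t, Y_t)\,dt + \overline{g}(t)\,dB_t,\\
\text{conditioning function:}\quad & h(t,y) = \mathbb{P}(Y_T \in S \mid Y_t = y),\\
\text{guided:}\quad & dY_t = \big[\,b(t,Y_t) + \overline{g}(t)^2\,\nabla \log h(t,Y_t)\,\big]\,dt + \overline{g}(t)\,dB_t.
\end{aligned}
$$

By the tower property, $h(t, Y_t)$ is a martingale under the pretrained dynamics, which is why it can in principle be fitted from pretrained trajectories alone.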
Abstract
We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. Such constraints arise naturally in safety-critical applications and in rare-event simulation, where soft or reward-based guidance methods offer no guarantee of constraint satisfaction. Building on a probabilistic interpretation of diffusion models, we develop a principled conditional diffusion guidance framework based on Doob's $h$-transform, martingale representation, and the quadratic variation process. The resulting guided dynamics augment a pretrained diffusion with an explicit drift correction involving the logarithmic gradient of a conditioning function $h$, without modifying the pretrained score network. Leveraging martingale and quadratic-variation identities, we propose two novel off-policy learning algorithms, one based on a martingale loss and one on a martingale-covariation loss, to estimate $h$ and its gradient using only trajectories from the pretrained model. We provide non-asymptotic guarantees for the resulting conditional sampler in both total variation and Wasserstein distances, explicitly characterizing the impact of score approximation and guidance estimation errors. Numerical experiments demonstrate the effectiveness of the proposed methods in enforcing hard constraints and generating rare-event samples.
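To make the drift correction concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. It assumes a one-dimensional Brownian motion as a stand-in for the pretrained sampler (unit diffusion coefficient, zero drift), the hard constraint $Y_T > 0$, and the closed-form conditioning function $h(t,y) = \Phi(y/\sqrt{T-t})$, so nothing is learned. The function names `log_h_grad` and `simulate` are hypothetical.

```python
import math
import random


def log_h_grad(t, y, T=1.0):
    """Gradient in y of log h, where h(t, y) = P(Y_T > 0 | Y_t = y) = Phi(y / sqrt(T - t))."""
    s = math.sqrt(T - t)
    z = max(y / s, -30.0)  # clip so that Phi(z) stays above floating-point underflow
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return pdf / (cdf * s)


def simulate(n_paths, n_steps, guided, seed=0, T=1.0):
    """Euler-Maruyama for dY = grad log h dt + dB (guided) or dY = dB (unguided).

    Returns the fraction of paths that end with Y_T > 0.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    hits = 0
    for _ in range(n_paths):
        y = 0.0
        for k in range(n_steps):
            t = k * dt  # always strictly below T, so sqrt(T - t) > 0
            drift = log_h_grad(t, y, T) if guided else 0.0
            y += drift * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        hits += y > 0.0
    return hits / n_paths


base_rate = simulate(1000, 200, guided=False)
guided_rate = simulate(1000, 200, guided=True)
print(f"unguided: {base_rate:.3f}, guided: {guided_rate:.3f}")
```

In continuous time the guided process satisfies the constraint with probability one. The Euler discretization used here only approximates that, so a small fraction of simulated paths may still end below zero, and that fraction should shrink as `n_steps` grows.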
