Practical and Asymptotically Exact Conditional Sampling in Diffusion Models
Luhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, John P. Cunningham
TL;DR
The paper introduces the Twisted Diffusion Sampler (TDS), a practical sequential Monte Carlo (SMC) method that draws asymptotically exact samples from the conditional distribution p_theta(x^0 | y) of an unconditionally trained diffusion model, avoiding task-specific conditional training. TDS uses twisting to incorporate the conditioning information through tractable approximations derived from the model's denoising predictions, while retaining the guarantee that the approximation converges to the target as the number of particles grows. The authors evaluate TDS on 2D toy problems, class-conditional generation on MNIST, and 3D protein motif-scaffolding with FrameDiff, reporting improved accuracy and flexibility over heuristic or purely conditional-training approaches. The method handles inpainting, conditioning problems with additional degrees of freedom, and diffusion models on Riemannian manifolds, which makes it a versatile framework for asymptotically exact conditional sampling across domains. Limitations include computational cost and sensitivity to the quality of the twisting functions; future work aims to improve efficiency and expand the conditioning capabilities.
Abstract
Diffusion models have been successful on a range of conditional generation tasks including molecular design and text-to-image generation. However, these achievements have primarily depended on task-specific conditional training or error-prone heuristic approximations. Ideally, a conditional generation method should provide exact samples for a broad range of conditional distributions without requiring task-specific training. To this end, we introduce the Twisted Diffusion Sampler, or TDS. TDS is a sequential Monte Carlo (SMC) algorithm that targets the conditional distributions of diffusion models through simulating a set of weighted particles. The main idea is to use twisting, an SMC technique that enjoys good computational efficiency, to incorporate heuristic approximations without compromising asymptotic exactness. We first find in simulation and in conditional image generation tasks that TDS provides a computational statistical trade-off, yielding more accurate approximations with many particles but with empirical improvements over heuristics with as few as two particles. We then turn to motif-scaffolding, a core task in protein design, using a TDS extension to Riemannian diffusion models. On benchmark test cases, TDS allows flexible conditioning criteria and often outperforms the state of the art.
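The weighted-particle idea behind SMC can be illustrated on a toy problem. The sketch below is not TDS and involves no diffusion model: it is a single importance-weighting and resampling step on a 1D Gaussian model chosen here for illustration (prior x ~ N(0, 1), observation y | x ~ N(x, sigma^2)), where the exact posterior mean y / (1 + sigma^2) is known in closed form. All function names are hypothetical. It shows the kind of computational-statistical trade-off the abstract refers to: the weighted estimate is approximate for few particles and approaches the exact value as the particle count grows.

```python
import math
import random


def weighted_particles(y, sigma, num_particles, seed=0):
    """Draw particles from the prior N(0, 1) and weight them by the likelihood.

    Toy model (an assumption for illustration): y | x ~ N(x, sigma^2).
    Returns the particles and their self-normalized weights.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    # Log-likelihood up to an additive constant; the constant cancels
    # when the weights are normalized.
    log_w = [-0.5 * ((y - x) / sigma) ** 2 for x in particles]
    max_log_w = max(log_w)  # subtract the max for numerical stability
    unnormalized = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(unnormalized)
    weights = [w / total for w in unnormalized]
    return particles, weights


def posterior_mean_estimate(particles, weights):
    """Self-normalized importance sampling estimate of E[x | y]."""
    return sum(w * x for x, w in zip(particles, weights))


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2); ranges from 1 to the number of particles."""
    return 1.0 / sum(w * w for w in weights)


def resample(particles, weights, seed=0):
    """Multinomial resampling: draw particles in proportion to their weights."""
    rng = random.Random(seed)
    return rng.choices(particles, weights=weights, k=len(particles))


def exact_posterior_mean(y, sigma):
    """Closed-form posterior mean for the conjugate Gaussian toy model."""
    return y / (1.0 + sigma ** 2)
```

For example, calling `weighted_particles(1.0, 0.5, 50000)` and passing the result to `posterior_mean_estimate` gives a value close to `exact_posterior_mean(1.0, 0.5)`, while the same call with only a handful of particles is noticeably noisier. An SMC sampler repeats this weight-and-resample pattern across a sequence of intermediate distributions; twisting, as described in the abstract, concerns how the intermediate targets and proposals are chosen, which this sketch does not attempt to reproduce.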
