AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models
Jiachun Pan, Jun Hao Liew, Vincent Y. F. Tan, Jiashi Feng, Hanshu Yan
TL;DR
This work tackles the challenge of customizing diffusion probabilistic models (DPMs) when the only supervision is a differentiable loss, without requiring multiple reference examples. It introduces AdjointDPM, a gradient-backpropagation framework that samples by solving the forward probability-flow ODE and then applies the adjoint sensitivity method, solving a backward augmented ODE, to compute gradients with respect to conditioning prompts, network weights, and initial noise. Exponential integration reparameterizes both ODEs into non-stiff forms. The approach yields a universal, memory-efficient mechanism for DPM customization and demonstrates applications in guided sampling, vocabulary expansion, single-reference stylization, and security auditing, with favorable sampling quality and efficiency relative to baselines. By enabling gradient-based optimization over all diffusion parameters under arbitrary differentiable losses, AdjointDPM broadens practical capabilities for controllable generation, safety analysis, and data-efficient style transfer in diffusion models.
Abstract
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models (DPMs) with user-provided concepts. This paper addresses the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated content. Since the sampling procedure of DPMs involves recursive calls to the denoising UNet, naïve gradient backpropagation requires storing the intermediate states of all iterations, resulting in extremely high memory consumption. To overcome this issue, we propose a novel method, AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs. It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the model parameters (including conditioning signals, network weights, and initial noise) by solving another augmented ODE. To reduce numerical errors in both the forward generation and gradient backpropagation processes, we further reparameterize the probability-flow ODE and the augmented ODE as simple non-stiff ODEs using exponential integration. Finally, we demonstrate the effectiveness of AdjointDPM on three tasks: converting visual effects into identification text embeddings, finetuning DPMs for specific types of stylization, and optimizing initial noise to generate adversarial samples for security auditing.
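To make the adjoint idea in the abstract concrete, the following is a minimal sketch on a toy scalar ODE, not the authors' implementation. It assumes a hypothetical linear drift f(x, t, θ) = -θx in place of the UNet-based probability-flow ODE, a loss L = ½·x(T)², and a fixed-step RK4 solver; all function names are illustrative. It omits the exponential-integration reparameterization entirely. The point it illustrates is that the backward pass re-solves the state together with the adjoint, so no forward trajectory needs to be stored.

```python
import numpy as np

# Hypothetical toy drift standing in for the probability-flow ODE right-hand side.
def f(x, t, theta):
    return -theta * x

def df_dx(x, t, theta):
    return -theta          # partial derivative of f with respect to the state

def df_dtheta(x, t, theta):
    return -x              # partial derivative of f with respect to the parameter

def rk4(func, z, t0, t1, steps):
    """Classical fixed-step RK4 from t0 to t1; the step is negative if t1 < t0."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = func(z, t)
        k2 = func(z + 0.5 * h * k1, t + 0.5 * h)
        k3 = func(z + 0.5 * h * k2, t + 0.5 * h)
        k4 = func(z + h * k3, t + h)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return z

def adjoint_gradients(x0, theta, T=1.0, steps=200):
    # Forward pass: solve the ODE and keep only the final state x(T).
    forward = lambda z, t: np.array([f(z[0], t, theta)])
    xT = rk4(forward, np.array([x0]), 0.0, T, steps)[0]
    loss = 0.5 * xT ** 2

    # Augmented backward ODE over z = [x, a, g], where a = dL/dx(t) is the
    # adjoint state and g accumulates the gradient with respect to theta:
    #   dx/dt = f,   da/dt = -a * df/dx,   dg/dt = -a * df/dtheta
    def augmented(z, t):
        x, a, _ = z
        return np.array([
            f(x, t, theta),
            -a * df_dx(x, t, theta),
            -a * df_dtheta(x, t, theta),
        ])

    # Terminal conditions: a(T) = dL/dx(T) = x(T), g(T) = 0.
    zT = np.array([xT, xT, 0.0])
    z0 = rk4(augmented, zT, T, 0.0, steps)   # integrate backward from T to 0
    grad_x0, grad_theta = z0[1], z0[2]
    return loss, grad_x0, grad_theta

loss, grad_x0, grad_theta = adjoint_gradients(x0=1.5, theta=0.7)
```

For this linear drift the gradients have closed forms, dL/dx0 = x(T)·exp(-θT) and dL/dθ = -T·x(T)², which gives a simple way to check the backward solve. In the setting the abstract describes, the hand-written partial derivatives would be replaced by vector-Jacobian products through the denoising network, and the parameter could equally be a conditioning embedding or the network weights.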
