AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models

Jiachun Pan, Jun Hao Liew, Vincent Y. F. Tan, Jiashi Feng, Hanshu Yan

TL;DR

This work tackles the challenge of customizing diffusion probabilistic models under a differentiable loss without requiring many example references. It introduces AdjointDPM, a gradient-backpropagation framework that uses the adjoint sensitivity method on the forward probability-flow ODE and a backward augmented ODE to compute gradients with respect to conditioning prompts, network weights, and initial noises, aided by a non-stiff reparameterization via exponential integration. The approach yields a universal, memory-efficient mechanism for DPM customization and demonstrates applications in guided sampling, vocabulary expansion, single-reference stylization, and security auditing, with favorable sampling quality and efficiency relative to baselines. By enabling gradient-based optimization over all diffusion parameters under arbitrary differentiable losses, AdjointDPM broadens practical capabilities for controllable generation, safety analysis, and data-efficient style transfer in diffusion models.

Abstract

Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models (DPMs) with user-provided concepts. This paper addresses the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated content. Since the sampling procedure of DPMs involves recursive calls to the denoising UNet, naïve gradient backpropagation requires storing the intermediate states of all iterations, resulting in extremely high memory consumption. To overcome this issue, we propose a novel method, AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs. It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters (including conditioning signals, network weights, and initial noises) by solving another augmented ODE. To reduce numerical errors in both the forward generation and gradient backpropagation processes, we further reparameterize the probability-flow ODE and augmented ODE as simple non-stiff ODEs using exponential integration. Finally, we demonstrate the effectiveness of AdjointDPM on three tasks: converting visual effects into identification text embeddings, finetuning DPMs for specific types of stylization, and optimizing initial noise to generate adversarial samples for security auditing.
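
To make the adjoint idea concrete, the following is a minimal sketch on a toy scalar ODE, not the paper's implementation. Everything in it is an illustrative assumption: the drift `f(x, theta) = theta * tanh(x)` stands in for the probability-flow ODE's right-hand side, the loss is `0.5 * x(T)^2`, time runs from 0 to T (the opposite of the usual diffusion sampling direction), and a plain RK4 solver replaces the exponential-integrator reparameterization described above. The backward pass solves one augmented ODE for the state, the adjoint `a(t) = dL/dx(t)`, and the accumulated parameter gradient, so no intermediate forward states are stored.

```python
import numpy as np


def f(x, theta):
    """Toy drift (hypothetical stand-in for the probability-flow ODE)."""
    return theta * np.tanh(x)


def df_dx(x, theta):
    """Partial derivative of the drift with respect to the state."""
    return theta * (1.0 - np.tanh(x) ** 2)


def df_dtheta(x, theta):
    """Partial derivative of the drift with respect to the parameter."""
    return np.tanh(x)


def rk4(rhs, y, t0, t1, n_steps):
    """Classical RK4 for an autonomous system; t1 < t0 integrates backward."""
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y


def forward(x0, theta, T=1.0, n_steps=100):
    """Solve dx/dt = f(x, theta) from t=0 to t=T and return x(T)."""
    return rk4(lambda x: f(x, theta), np.float64(x0), 0.0, T, n_steps)


def loss(xT):
    """Differentiable metric on the final sample (toy choice)."""
    return 0.5 * xT ** 2


def adjoint_gradients(x0, theta, T=1.0, n_steps=100):
    """Return (dL/dx0, dL/dtheta) by solving the augmented ODE backward."""
    xT = forward(x0, theta, T, n_steps)
    aT = xT  # dL/dx(T) for the toy loss 0.5 * x(T)^2

    def augmented(y):
        x, a, _ = y
        return np.array([
            f(x, theta),               # re-integrate the state backward
            -a * df_dx(x, theta),      # adjoint dynamics: da/dt = -a * df/dx
            -a * df_dtheta(x, theta),  # accumulates the gradient w.r.t. theta
        ])

    # Start at t=T with zero accumulated parameter gradient, end at t=0.
    y0 = rk4(augmented, np.array([xT, aT, 0.0]), T, 0.0, n_steps)
    _, grad_x0, grad_theta = y0
    return grad_x0, grad_theta


def finite_difference_gradients(x0, theta, eps=1e-5):
    """Central differences through the forward solver, used only as a check."""
    gx = (loss(forward(x0 + eps, theta)) - loss(forward(x0 - eps, theta))) / (2 * eps)
    gt = (loss(forward(x0, theta + eps)) - loss(forward(x0, theta - eps))) / (2 * eps)
    return gx, gt


if __name__ == "__main__":
    print("adjoint:           ", adjoint_gradients(0.7, -1.3))
    print("finite differences:", finite_difference_gradients(0.7, -1.3))
```

The two gradient estimates should agree closely. In this sketch the initial state `x0` plays the role of the initial noise and `theta` the role of the network weights or conditioning signal; the memory saving comes from recomputing `x(t)` during the backward solve instead of caching every solver step.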

Paper Structure (27 sections, 21 equations, 12 figures, 4 tables, 3 algorithms)

Figures (12)

  • Figure 1: Examples of vocabulary expansion. The original Stable Diffusion cannot generate samples whose features exactly match the ground-truth reference images. Guided by a fine-grained visual classification (FGVC) model, AdjointDPM can steer Stable Diffusion to synthesize a specific breed of animal. Here, the generated dogs' faces closely resemble the target breeds. The generated birds also have features closer to those in real images, such as the black head of the Orchard Oriole and the blue feathers of the Blue-Winged Warbler.
  • Figure 2: Adversarial samples against the NSFW filter. Images generated by conditioning on harmful prompts (e.g., "A photograph of a naked man") are shown on the left; the NSFW filter blocks these images. Images generated from adversarial initial noises, however, circumvent the NSFW filter (black squares were added by the authors for publication).
  • Figure 3: Stylization examples. The top row shows images generated by the original Stable Diffusion; the bottom row shows samples from the stylized Stable Diffusion.
  • Figure 4: Examples on prompt inversion - part 1
  • Figure 5: Examples on prompt inversion - part 2
  • ...and 7 more figures