Iterative Tilting for Diffusion Fine-Tuning

Jean Pachebat, Giovanni Conforti, Alain Durmus, Yazid Janati

TL;DR

This work addresses reward-driven fine-tuning of diffusion models without backpropagating through the sampling chain. It introduces iterative tilting, which decomposes a large reward tilt into $N$ small, tractable tilts and updates the score at each step via a first-order Taylor expansion, using only forward evaluations of the reward. The result is a gradient-free route toward reward-tilted distributions. The method is validated on a 2D Gaussian mixture, where the exact tilted scores are available and the learned score can be checked for convergence across tilts. Compared with gradient-based methods such as DRaFT and adjoint-based stochastic optimal control (SOC), iterative tilting trades a single gradient update for multiple cheap gradient-free updates, which makes non-differentiable or black-box rewards usable. The results suggest practical potential for high-dimensional, reward-driven diffusion tuning, with open directions in variance reduction, criteria for selecting $N$, and integration with efficient adapters.

Abstract

We introduce iterative tilting, a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. The method decomposes a large reward tilt $\exp(\lambda r)$ into $N$ sequential smaller tilts, each admitting a tractable score update via first-order Taylor expansion. This requires only forward evaluations of the reward function and avoids backpropagating through sampling chains. We validate on a two-dimensional Gaussian mixture with linear reward, where the exact tilted distribution is available in closed form.
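
The decomposition can be checked numerically in the closed-form setting the abstract mentions. The sketch below is not the authors' code: it assumes a linear reward $r(x) = a^\top x$ and uses the standard Gaussian identity that tilting $\mathcal N(\mu, \Sigma)$ by $\exp(\lambda a^\top x)$ gives $\mathcal N(\mu + \lambda \Sigma a, \Sigma)$ up to the factor $\exp(\lambda a^\top \mu + \tfrac{1}{2}\lambda^2 a^\top \Sigma a)$. The function name `tilt_gmm` and all mixture parameters are hypothetical, chosen for illustration. It compares one tilt of strength $\lambda$ with $N$ sequential tilts of strength $\lambda/N$.

```python
import numpy as np


def tilt_gmm(weights, means, covs, a, lam):
    """Exactly tilt a Gaussian mixture by exp(lam * a^T x).

    Component N(mu_k, S_k) becomes N(mu_k + lam * S_k a, S_k), and its weight
    is multiplied by exp(lam * a^T mu_k + 0.5 * lam^2 * a^T S_k a).
    """
    Sa = covs @ a                                   # (K, d): S_k a per component
    log_w = np.log(weights) + lam * (means @ a) + 0.5 * lam**2 * (Sa @ a)
    log_w -= log_w.max()                            # stabilise before exponentiating
    new_w = np.exp(log_w)
    new_w /= new_w.sum()                            # renormalise the mixture weights
    return new_w, means + lam * Sa, covs


# Hypothetical 2-D mixture and reward direction (illustrative values only).
weights = np.array([0.5, 0.3, 0.2])
means = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
covs = np.stack([0.5 * np.eye(2)] * 3)
a = np.array([1.0, 0.5])
lam, N = 2.0, 50

# One large tilt exp(lam * r).
w_direct, mu_direct, _ = tilt_gmm(weights, means, covs, a, lam)

# N sequential small tilts exp((lam / N) * r).
w_seq, mu_seq = weights, means
for _ in range(N):
    w_seq, mu_seq, _ = tilt_gmm(w_seq, mu_seq, covs, a, lam / N)

max_w_gap = np.abs(w_seq - w_direct).max()
max_mu_gap = np.abs(mu_seq - mu_direct).max()
print(max_w_gap, max_mu_gap)
```

Both gaps should be at floating-point level, since $\prod_{k=1}^{N} \exp(\tfrac{\lambda}{N} r) = \exp(\lambda r)$ holds exactly. This only illustrates the target of each tilt; the method's first-order score update, which introduces approximation error at each step, is not reproduced here.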

Paper Structure

This paper contains 27 sections, 2 theorems, 47 equations, 2 figures, 1 table, 2 algorithms.

Key Result

Proposition 3.1

Assume that $r$ has at most linear growth, i.e., $\forall x\in\mathbb R^d:\;|r(x)| \leqslant C(1 + \|x\|)$ for some $C > 0$, and that $q_{t|0}$ is the Gaussian forward kernel (eq:forward_kernel). Then for all $t \in (0,1]$ and $x_t \in \mathbb R^d$,

Figures (2)

  • Figure 1: Iterative tilting on the 2-D GMM. Each row corresponds to a fixed $N$ (top to bottom: $N=20,50,100,200$), shown at iterations $N/2$ (left) and $N$ (right). The right plot in each row corresponds to the target distribution.
  • Figure 2: Evolution of the score error during iterative tilting for different $N$. We plot RMSE ($\sqrt{\mathrm{MSE}}$) versus tilt iteration for $N\in\{20,50,100,200\}$.

Theorems & Definitions (4)

  • Proposition 3.1: Score of the tilted distribution
  • proof
  • Corollary 3.2: First-order score approximation
  • proof