Aligning Diffusion Models by Optimizing Human Utility
Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
TL;DR
Diffusion-KTO introduces a reward-model-free framework for aligning text-to-image diffusion models by optimizing a Kahneman-Tversky–style utility over per-step, per-image actions using binary feedback (like/dislike). By extending the utility maximization paradigm to diffusion processes, it avoids collecting pairwise preferences and backpropagating through the full denoising trajectory. Empirically, Diffusion-KTO outperforms supervised fine-tuning, Diffusion-DPO, and other baselines across human judgments and automated metrics (PickScore, ImageReward, LAION aesthetics, CLIP) on multiple datasets, demonstrating robust improvements in image fidelity and prompt adherence. The work also shows the method’s flexibility, including synthetic alignment to specific user preferences and generalization to different SD variants, while acknowledging dataset biases and safety limitations as important considerations for real-world deployment.
Abstract
We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility. Since this objective applies to each generation independently, Diffusion-KTO requires neither collecting costly pairwise preference data nor training a complex reward model. Instead, our objective requires only simple per-image binary feedback signals, e.g., likes or dislikes, which are abundantly available. After fine-tuning using Diffusion-KTO, text-to-image diffusion models exhibit superior performance compared to existing techniques, including supervised fine-tuning and Diffusion-DPO, both in terms of human judgment and automatic evaluation metrics such as PickScore and ImageReward. Overall, Diffusion-KTO unlocks the potential of leveraging readily available per-image binary signals and broadens the applicability of aligning text-to-image diffusion models with human preferences.
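To make the idea of a per-image utility driven by binary feedback more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sigmoid as the utility shape, the `beta` scale, the `reference_point` value, and the use of a denoising-error gap as a stand-in for the model-versus-reference log-ratio are all hypothetical choices made for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def implicit_reward(err_model: float, err_ref: float, beta: float = 1.0) -> float:
    """Assumed proxy for how much the fine-tuned model favours a sample
    relative to a frozen reference model: a lower denoising error than the
    reference yields a positive reward. `beta` is a hypothetical scale."""
    return -beta * (err_model - err_ref)


def utility(reward: float, liked: bool, reference_point: float = 0.0) -> float:
    """Kahneman-Tversky-style utility (sigmoid shape assumed here).
    The sign flips for disliked images, so the utility rises when the model
    moves toward liked samples and away from disliked ones."""
    sign = 1.0 if liked else -1.0
    return sigmoid(sign * (reward - reference_point))


def batch_loss(samples):
    """Negative mean utility over (err_model, err_ref, liked) triples.
    Each sample contributes independently; no pairing is needed."""
    total = sum(utility(implicit_reward(m, r), liked) for m, r, liked in samples)
    return -total / len(samples)


if __name__ == "__main__":
    # Two independent feedback records: one liked image, one disliked image.
    batch = [(0.8, 1.0, True), (1.2, 1.0, False)]
    print(batch_loss(batch))
```

Each record is scored on its own, which is why binary likes and dislikes suffice in this sketch and no preferred/rejected pairs have to be assembled.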
