Attention in Diffusion Model: A Survey
Litao Hua, Fan Liu, Jie Su, Xingyu Miao, Zizhou Ouyang, Zeyu Wang, Runze Hu, Zhenyu Wen, Bing Zhai, Yang Long, Haoran Duan, Yuan Zhou
TL;DR
Attention mechanisms are foundational in diffusion models and influence both generative and discriminative tasks. This survey delivers a unified taxonomy of attention modifications that operate on different components of diffusion architectures, and maps these techniques to a broad set of unimodal and multimodal tasks. It reviews architectural innovations, performance benefits, and practical applications, and identifies limitations and underexplored directions for future work. By clarifying how attention interfaces with diffusion dynamics, the paper provides a roadmap for designing more controllable, efficient, and interpretable diffusion-based systems.
Abstract
Attention mechanisms have become a foundational component in diffusion models, significantly influencing their capacity across a wide range of generative and discriminative tasks. This paper presents a comprehensive survey of attention within diffusion models, systematically analysing its roles, design patterns, and operations across different modalities and tasks. We propose a unified taxonomy that categorises attention-related modifications according to the structural components they affect, offering a clear lens through which to understand their functional diversity. In addition to reviewing architectural innovations, we examine how attention mechanisms contribute to performance improvements in diverse applications. We also identify current limitations and underexplored areas, and outline potential directions for future research. Our study provides valuable insights into the evolving landscape of diffusion models, with a particular focus on the integrative and ubiquitous role of attention.
