Diffusion-Based Visual Art Creation: A Survey and New Perspectives
Bingyuan Wang, Qifeng Chen, Zeyu Wang
TL;DR
This survey addresses diffusion-based visual art creation by mapping artistic goals to diffusion-method design and examining how data, tasks, and modalities shape technical solutions. It advances a two-axis framework that links artistic scenarios with diffusion-model modules, yielding a structured roadmap from artistic requirements to method design. Key contributions include a comprehensive dataset and taxonomy of AIGC techniques in visual art, a framework correlating scenario-modality-task-method, and a synthesis of frontiers, trends, and future outlooks from technical and synergistic perspectives. The work underscores the evolving collaboration between humans and AI in art, highlighting interactive systems, cross-modal alignment, and innovative architectures as pathways to richer, responsible digital artistry with broad practical impact for artists, educators, and technologists alike.
Abstract
The integration of generative AI in visual art has revolutionized not only how visual content is created but also how AI interacts with and reflects the underlying domain knowledge. This survey explores the emerging realm of diffusion-based visual art creation, examining its development from both artistic and technical perspectives. We structure the survey into three phases: data feature and framework identification, detailed analyses using a structured coding process, and open-ended prospective outlooks. Our findings reveal how artistic requirements are transformed into technical challenges and highlight the design and application of diffusion-based methods within visual art creation. We also provide insights into future directions from technical and synergistic perspectives, suggesting that the confluence of generative AI and art has shifted the creative paradigm and opened up new possibilities. By summarizing the development and trends of this emerging interdisciplinary area, we aim to shed light on the mechanisms through which AI systems emulate and possibly enhance human capacities in artistic perception and creativity.
