From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten, Derya Soydaner
TL;DR
The paper surveys how deep neural networks enable AI-generated art, tracing the field from early CNN visualizations to modern diffusion- and transformer-based text-to-image systems. It catalogs the core building blocks (CNNs, autoencoders, GANs, Transformers, diffusion models) and highlights milestones such as DeepDream, DALL-E 3, Stable Diffusion, and Make-A-Scene. It also compares model capabilities, limitations, and accessibility, and discusses ethical concerns around deepfakes, copyright, and open-source governance. The work underscores the rapid maturation of AI art tools and their implications for authorship, aesthetics, and policy.
Abstract
This paper delves into the fascinating field of AI-generated art and explores the various deep neural network architectures and models that have been utilized to create it. From the classic convolutional networks to the cutting-edge diffusion models, we examine the key players in the field. We explain the general structures and working principles of these neural networks. Then, we showcase examples of milestones, starting with the dreamy landscapes of DeepDream and moving on to the most recent developments, including Stable Diffusion and DALL-E 3, which produce mesmerizing images. We provide a detailed comparison of these models, highlighting their strengths and limitations, and examining the remarkable progress that deep neural networks have made so far in a short period of time. With a unique blend of technical explanations and insights into the current state of AI-generated art, this paper exemplifies how art and computer science interact.
