Pencils to Pixels: A Systematic Study of Creative Drawings across Children, Adults and AI
Surabhi S Nath, Guiomar del Cuvillo y Schröder, Claire E. Stevenson
TL;DR
This work addresses the challenge of comparing visual creativity across humans and AI by assembling a diverse dataset of 1338 drawings from children, adults, and AI, all produced on the same MTCI-generated stimuli $G$, $I$, and $R$. It introduces a content–style framework, with four style metrics and multimodal content measures derived from CLIP embeddings and GPT-4o captions, enabling cross-agent creativity analysis. The study finds distinct group differences: AI drawings show higher ink density, children's drawings contain more components, and adult drawings exhibit the greatest conceptual diversity. It also reveals a marked misalignment between expert and automated creativity ratings, underscoring the need for multi-faceted evaluation. The proposed framework and dataset offer domain-agnostic insights into creativity and practical paths to align AI outputs with human creative judgments. Data and code are publicly available on GitHub.
Abstract
Can we derive computational metrics to quantify visual creativity in drawings across intelligent agents, while accounting for inherent differences in technical skill and style? To answer this, we curate a novel dataset consisting of 1338 drawings by children, adults and AI on a creative drawing task. We characterize two aspects of the drawings: (1) style and (2) content. For style, we define measures of ink density, ink distribution and number of elements. For content, we use expert-annotated categories to study conceptual diversity, and image and text embeddings to compute distance measures. We compare the style, content and creativity of children's, adults' and AI drawings and build simple models to predict expert and automated creativity scores. We find significant differences in style and content across the groups: children's drawings had more components, AI drawings had greater ink density, and adult drawings showed the greatest conceptual diversity. Notably, we highlight a misalignment between creativity judgments obtained through expert and automated ratings and discuss its implications. Through these efforts, our work provides, to the best of our knowledge, the first framework for studying human and artificial creativity beyond the textual modality, and attempts to arrive at the domain-agnostic principles underlying creativity. Our data and scripts are available on GitHub.
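To make two of the style measures named above more concrete, the following is a minimal sketch of how ink density and the number of elements could be computed on a binarized drawing. It is an illustration under stated assumptions, not the paper's implementation: the drawing is assumed to be a rectangular grid of 0/1 values (1 = ink), ink density is assumed to be the fraction of ink pixels, and elements are assumed to be 4-connected groups of ink pixels. The function names `ink_density` and `count_components` are hypothetical.

```python
from collections import deque


def ink_density(grid):
    """Fraction of pixels that carry ink (value 1) in a binary grid.

    Assumption: density is defined as ink pixels / total pixels.
    """
    total = sum(len(row) for row in grid)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in grid) / total


def count_components(grid):
    """Count 4-connected groups of ink pixels using breadth-first search.

    Assumption: one connected group of ink pixels is one drawing element.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Found an unvisited ink pixel: flood-fill its whole component.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return components


# Toy 3x4 "drawing": 5 ink pixels forming 3 separate strokes.
toy = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
print(ink_density(toy))       # 5 of 12 pixels are ink
print(count_components(toy))  # 3
```

On real scans, the result depends on how the image is binarized and on the choice of connectivity (4- versus 8-connected), so both should be fixed before comparing groups.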
