Generative AI and Creativity: A Systematic Literature Review and Meta-Analysis
Niklas Holzner, Sebastian Maier, Stefan Feuerriegel
TL;DR
This study asks whether Generative AI (GenAI) can match or enhance human creativity, and how collaboration between humans and GenAI affects creativity and idea diversity. Using a PRISMA-based systematic review and a random-effects meta-analysis of 28 studies (127 effect sizes, 8214 participants), the authors examine three outcomes: the creativity of GenAI alone compared with humans, the creativity of humans working with GenAI compared with humans working without it, and the diversity of ideas produced by humans working with GenAI. They report three core findings: GenAI's standalone creativity is roughly on par with human performance (g ≈ -0.05), human-GenAI collaboration yields a modest creativity boost (g ≈ 0.27), and collaboration markedly reduces idea diversity (g ≈ -0.86). Heterogeneity is substantial and is moderated by GenAI model, task, and participant background. The results suggest GenAI is best treated as an augmentation tool rather than a replacement for human creativity, and they highlight design and domain considerations for mitigating diversity losses in practical applications.
Abstract
Generative artificial intelligence (GenAI) is increasingly used to support a wide range of human tasks, yet empirical evidence on its effect on creativity remains scattered. Can GenAI generate ideas that are creative? To what extent can it support humans in generating ideas that are both creative and diverse? In this study, we conduct a meta-analysis to evaluate the effect of GenAI on performance in creative tasks. For this, we first perform a systematic literature search, based on which we identify n = 28 relevant studies (m = 8214 participants) for inclusion in our meta-analysis. We then compute standardized effect sizes based on Hedges' g. We compare different outcomes: (i) how creative GenAI is; (ii) how creative humans augmented by GenAI are; and (iii) how diverse the ideas of humans augmented by GenAI are. Our results show no significant difference in creative performance between GenAI and humans (g = -0.05), while humans collaborating with GenAI significantly outperform those working without assistance (g = 0.27). However, GenAI has a significant negative effect on the diversity of ideas for such collaborations between humans and GenAI (g = -0.86). We further analyze heterogeneity across different GenAI models (e.g., GPT-3.5, GPT-4), different tasks (e.g., creative writing, ideation, divergent thinking), and different participant populations (e.g., laypeople, business, academia). Overall, our results position GenAI as an augmentative tool that can support, rather than replace, human creativity, particularly in tasks benefiting from ideation support.
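To make the effect-size step concrete, the following is a minimal sketch of how Hedges' g can be computed from group summary statistics and how several effect sizes can be pooled under a random-effects model. It is illustrative only: the function names are hypothetical, the input numbers are invented, and the DerSimonian-Laird estimator for between-study variance is an assumption, since the abstract does not state which random-effects estimator the authors use.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, var_g) for a treatment group versus a control group."""
    df = n_t + n_c - 2
    # Pooled standard deviation of the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd  # Cohen's d
    j = 1 - 3 / (4 * df - 1)           # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


def pool_random_effects(effects, variances):
    """Pool effect sizes; returns (pooled, standard_error, tau2).

    Assumption: DerSimonian-Laird estimate of the between-study
    variance tau2 (not confirmed by the source).
    """
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2


# Invented summary statistics for three hypothetical studies:
# (mean, sd, n) with GenAI, then (mean, sd, n) without.
studies = [
    (5.4, 1.2, 40, 5.0, 1.3, 42),
    (3.9, 0.9, 25, 3.8, 1.0, 25),
    (6.1, 1.5, 60, 5.5, 1.4, 58),
]
results = [hedges_g(*s) for s in studies]
pooled, se, tau2 = pool_random_effects(
    [g for g, _ in results], [v for _, v in results]
)
print(f"pooled g = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

A positive pooled g in this sketch means the first group scored higher than the second, which matches the sign convention of the reported results (for example, g = 0.27 in favor of humans collaborating with GenAI).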
