Characterising the Creative Process in Humans and Large Language Models

Surabhi S. Nath, Peter Dayan, Claire Stevenson

TL;DR

The paper examines how creativity emerges in humans versus large language models (LLMs) by moving beyond product assessment to analyze the creative process. It introduces an automated, data-driven pipeline that uses sentence embeddings to categorize responses and detect jumps between semantic spaces during sequential idea generation on the Alternate Uses Task (AUT) and a Verbal Fluency Task (VFT), enabling direct comparison of human and LLM behavior. Results show both persistent and flexible pathways to creativity in humans and task-dependent biases towards one pathway or the other in LLMs, with more flexible LLMs scoring higher on originality and, overall, LLMs scoring higher on AUT originality than humans. The work provides a principled framework for studying and deploying AI as artificial participants or co-creators in creative tasks.

Abstract

Large language models appear quite creative, often performing on par with the average human on creative tasks. However, research on LLM creativity has focused solely on products, with little attention to the creative process. Process analyses of human creativity often require hand-coded categories or exploit response times, which do not apply to LLMs. We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task, and contrast with behaviour in a Verbal Fluency Task. We use sentence embeddings to identify response categories and compute semantic similarities, which we use to generate jump profiles. Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity, where both pathways lead to similar creativity scores. LLMs were found to be biased towards either persistent or flexible paths, with the bias varying across tasks. Though LLMs as a population match human profiles, their relationship with creativity is different: the more flexible models score higher on creativity. Our dataset and scripts are available on GitHub (https://github.com/surabhisnath/Creative_Process).

Paper Structure (13 sections, 3 figures)

Figures (3)

  • Figure 1: Example persistent, flexible and mixed response sequences. $r_{i}$ denotes the $i^{th}$ response, coloured regions denote the semantic spaces/concepts/categories. Note, in practice, most sequences will be mixed, containing different patterns of persistence and flexibility.
  • Figure 2: (A) Humans and LLMs perform 3 tasks: the Alternate Uses Task (AUT) for brick and paperclip, and a Verbal Fluency Task (VFT) of naming animals. (B) Our method for obtaining jumps in the response sequence. Sentence embeddings are used for assigning response categories and evaluating semantic similarities, which respectively give jump$_{cat}$ and jump$_{SS}$. Their logical AND gives jump.
  • Figure 3: (A) 3 human clusters for each task: persistent, flexible and mixed. Each coloured trajectory represents 1 participant. Percentages in each row indicate the percentage of participants assigned to that cluster. (B) Percentage of each LLM's response sequences assigned to each cluster. * indicates that not all temperatures for that model were included (0.4-1 for Mistral and 0.7-1 for NousResearch).
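
The jump computation described in the Figure 2(B) caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 2-D toy "embeddings", the category labels and the similarity threshold of 0.5 are all assumptions made for illustration. Only the overall logic (a category-change signal, a low-similarity signal, and their logical AND) comes from the caption.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def jump_signals(embeddings, categories, sim_threshold=0.5):
    """Compute jump signals for each consecutive pair of responses.

    Returns three lists of booleans, one entry per consecutive pair:
      jump_cat: the assigned category changed
      jump_ss:  the semantic similarity fell below sim_threshold
      jump:     logical AND of the two signals
    sim_threshold is a hypothetical value, not taken from the paper.
    """
    jump_cat, jump_ss, jump = [], [], []
    for i in range(1, len(embeddings)):
        cat_changed = categories[i] != categories[i - 1]
        sim = cosine_similarity(embeddings[i], embeddings[i - 1])
        dissimilar = sim < sim_threshold
        jump_cat.append(cat_changed)
        jump_ss.append(dissimilar)
        jump.append(cat_changed and dissimilar)
    return jump_cat, jump_ss, jump


# Toy stand-ins for sentence embeddings of five sequential responses
# (real embeddings would come from a sentence-embedding model).
embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.2, 0.8)]
# Hypothetical category labels for the same five responses.
categories = ["building", "building", "weapon", "weapon", "art"]

jump_cat, jump_ss, jump = jump_signals(embeddings, categories)
print(jump_cat)
print(jump_ss)
print(jump)
```

In this toy sequence the last pair changes category but remains semantically close, so only the category signal fires and the combined jump does not. Requiring both signals makes the jump count less sensitive to a single noisy category assignment or a single borderline similarity value.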