The author is dead, but what if they never lived? A reception experiment on Czech AI- and human-authored poetry

Anna Marklová, Ondřej Vinš, Martina Vokáčová, Jiří Milička

TL;DR

This study tests whether Czech native speakers can distinguish AI-authored from human-authored poetry and how beliefs about authorship affect evaluation. Using 16 human–AI poem pairs spanning modern and nonsense Czech poetry, the authors find recognition at near-chance levels, with better discrimination for nonsense poems, and a strong authorship bias in aesthetic judgments: participants rate poems less favorably when they believe them to be AI-generated, even though AI poems are in fact rated as favorably as, or more favorably than, human ones. The work demonstrates AI's capacity to produce stylistically convincing Czech poetry, including nonsense verse, and highlights the interplay between reader beliefs and enjoyment. The findings have implications for the perception of AI in literary creativity, the role of authorship in art appreciation, and future cross-language evaluation of AI-generated poetry.

Abstract

Large language models are increasingly capable of producing creative texts, yet most studies on AI-generated poetry focus on English, a language that dominates training data. In this paper, we examine the perception of AI- and human-written Czech poetry. We ask whether Czech native speakers are able to identify the authorship of a poem and how they judge it aesthetically. Participants performed at chance level when guessing authorship (45.8% correct on average), indicating that Czech AI-generated poems were largely indistinguishable from human-written ones. Aesthetic evaluations revealed a strong authorship bias: when participants believed a poem was AI-generated, they rated it less favorably, even though AI poems were in fact rated equally or more favorably than human ones on average. A logistic regression model showed that the more participants liked a poem, the less likely they were to assign its authorship correctly. Neither familiarity with poetry nor literary background had an effect on recognition accuracy. Our findings show that AI can convincingly produce poetry even in a morphologically complex Slavic language such as Czech, which is low-resource with respect to the training data of AI models. The results suggest that readers' beliefs about authorship and their aesthetic evaluation of a poem are interconnected.

Paper Structure

This paper contains 19 sections, 13 figures, 2 tables.

Figures (13)

  • Figure 1: A scheme of creating the experiment materials.
  • Figure 2: Distribution of correctness rates across participants. The x-axis represents the proportion of correctly identified poems, and the y-axis shows the number of participants.
  • Figure 3: Average liking according to poem authorship.
  • Figure 4: Average liking according to perceived poem authorship.
  • Figure 5: Average imaginativeness according to poem authorship.
  • ...and 8 more figures