What Does AI Do for Cultural Interpretation? A Randomized Experiment on Close Reading Poems with Exposure to AI Interpretation

Jiayin Zhi, Hoyt Long, Richard Jean So, Mina Lee

TL;DR

A single AI interpretation improved both interpretive performance and pleasure, whereas multiple AI interpretations improved performance only. Participants who relied heavily on AI performed better on the task but showed lower pleasure.

Abstract

AI demonstrates unprecedented reasoning capabilities, but its increasing integration into human reasoning via automated reading and summarization has provoked debate about its use for cultural interpretation. Close reading -- the practice of understanding, analyzing, and critiquing cultural texts for pleasure -- is a skill at the core of such interpretation and has traditionally been seen as exclusive to humans. To test AI's impact on close reading, in terms of both interpretative performance and pleasure, we conducted a preregistered randomized experiment (n=400) on close reading poems that compared two forms of AI assistance, presenting a single AI interpretation or multiple AI interpretations, against no AI assistance. We found that a single AI interpretation boosted both performance and pleasure, while multiple AI interpretations improved only performance. Further exploration revealed a trade-off: participants who relied heavily on AI showed better performance on the task but lower pleasure. Our results contribute to the discussion on whether and how to calibrate AI assistance for cultural interpretation: "less is more."

Paper Structure (52 sections, 5 figures, 8 tables)

Figures (5)

  • Figure 1: Illustration of the three experimental conditions. Participants in all conditions interpret poems through close reading, with randomly assigned AI assistance: (left) Control condition where participants interpret alone, (center) AI-Single condition where one AI interpretation is provided, and (right) AI-Multiple condition where three AI interpretations are provided.
  • Figure 2: Study interface showing the two-step process for each poem across experimental conditions. (a) All participants first read the poem briefly. (b-d) Participants then complete the close reading task, identifying and explaining three stylistic features with assigned AI assistance: no assistance in Control, one AI interpretation provided in AI-Single, or three AI interpretations provided in AI-Multiple. The two-step process follows the task structure in the CRIT [utaustin2024critical]. The task structure remains identical across conditions; only the presence and amount of AI assistance varies.
  • Figure 3: Comparison of Interpretive Performance measures (Feature Identification, Interpretation Quality, and Writing Quality) between Control and AI-Single, and between Control and AI-Multiple, within (a) inexperienced and (b) experienced readers. In each plot, the diamonds represent means, the box plots represent medians and interquartile ranges, and the violin plots represent distributions. Significant differences are annotated based on adjusted p-values after correction (* $p<0.05$, ** $p<0.01$, *** $p<0.001$).
  • Figure 4: Comparison of Subjective Experience measures (Appreciation, Enjoyment, and Self-efficacy) between Control and AI-Single, and between Control and AI-Multiple, within (a) inexperienced and (b) experienced readers. In each plot, the diamonds represent means, the box plots represent medians and interquartile ranges, and the violin plots represent distributions. Significant differences are annotated based on adjusted p-values after correction (* $p<0.05$, ** $p<0.01$, *** $p<0.001$).
  • Figure 5: Left: Layout of the four areas of the study interface during the poetry interpretation task. Right: Average percentage of time participants spent with their cursor in each area. The data show how participants potentially allocated their attention across the question area (interpretation instructions), poem area (poem text), AI area (not available in Control), and answer area (participant answer fields) during the task.
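
The cursor-based measure described for Figure 5 can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes a hypothetical log format in which each event records the time (in seconds) at which the cursor entered one of the four areas, and it computes each participant's percentage of time per area before averaging across participants. The names `area_time_shares`, `mean_shares`, and the area labels are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical labels for the four interface areas named in the Figure 5 caption.
AREAS = ("question", "poem", "ai", "answer")


def area_time_shares(events, task_end):
    """Return the percentage of task time the cursor spent in each area.

    events: list of (timestamp_seconds, area) tuples sorted by time, where
            each tuple marks the cursor entering an area (assumed log format).
    task_end: timestamp at which the task ended.
    """
    totals = defaultdict(float)
    # Pair every event with the next one; the last event lasts until task_end.
    following = events[1:] + [(task_end, None)]
    for (start, area), (end, _) in zip(events, following):
        totals[area] += end - start
    total = sum(totals.values())
    if total == 0:
        return {area: 0.0 for area in AREAS}
    return {area: 100.0 * totals[area] / total for area in AREAS}


def mean_shares(per_participant):
    """Average the per-participant percentages for each area."""
    n = len(per_participant)
    return {area: sum(p[area] for p in per_participant) / n for area in AREAS}


if __name__ == "__main__":
    # Made-up example log for one participant, for illustration only.
    log = [(0, "question"), (10, "poem"), (40, "ai"), (60, "answer")]
    print(area_time_shares(log, task_end=100))
```

Averaging per-participant percentages, rather than pooling raw seconds, keeps participants who took longer from dominating the result. Whether the study aggregated this way is not stated in the caption.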