Interpreting Themes from Educational Stories

Yigeng Zhang, Fabio A. González, Thamar Solorio

TL;DR

The paper addresses interpretive comprehension in NLP by introducing EduStory, the first dataset tailored for inferring themes in educational narratives. It defines four related tasks (theme keyword identification, story-theme matching, story reading comprehension on themes, and theme generation) and provides extensive empirical analysis across traditional ML models and large language models. Results show that theme interpretation remains challenging even for state-of-the-art methods: human evaluation indicates that LLMs such as ChatGPT can produce high-quality interpretations but are not universally reliable. EduStory thus offers a benchmark for advancing narrative understanding beyond literal content, with potential impact in education, moral reasoning, and AI-assisted storytelling. The authors have released the dataset and code publicly for ongoing research.

Abstract

Reading comprehension continues to be a crucial research focus in the NLP community. Recent advances in Machine Reading Comprehension (MRC) have mostly centered on literal comprehension, referring to the surface-level understanding of content. In this work, we focus on the next level, interpretive comprehension, with a particular emphasis on inferring the themes of a narrative text. We introduce the first dataset specifically designed for interpretive comprehension of educational narratives, providing corresponding well-edited theme texts. The dataset spans a variety of genres and cultural origins and includes human-annotated theme keywords with varying levels of granularity. We further formulate NLP tasks under different abstractions of interpretive comprehension toward the main idea of a story. After conducting extensive experiments with state-of-the-art methods, we found the task to be both challenging and significant for NLP research. The dataset and source code have been made publicly available to the research community at https://github.com/RiTUAL-UH/EduStory.

Paper Structure (32 sections, 3 figures, 9 tables)

Figures (3)

  • Figure 1: An example of theme interpretation.
  • Figure 2: The distribution of word count of the story collection.
  • Figure 3: The statistical plot of the cultural origins of the stories in the dataset.