Previously on the Stories: Recap Snippet Identification for Story Reading

Jiangnan Li, Qiujing Wang, Liyan Xu, Wenjie Pang, Mo Yu, Zheng Lin, Weiping Wang, Jie Zhou

TL;DR

This work introduces Recap Snippet Identification, the task of identifying snippets in the prior context that are temporally and causally linked to a target snippet in books and TV productions. It presents RECIDENT, a hand-crafted benchmark covering book and TV domains, with expert annotations, cross-language alignment, and a consistent 60-snippet history window. The study evaluates prompting-based LLMs, unsupervised Line2Note training, and supervised fine-tuning. The results reveal a gap between human performance and current models, and show that Line2Note improves similarity-based models while LLMs struggle as direct rankers. The findings indicate that proximity and explicit event information aid recap identification. The authors propose a practical pipeline that combines lightweight models with LLM prompts to balance performance and efficiency, with implications for reading apps and narrative understanding systems.

Abstract

Similar to the "previously-on" scenes in TV shows, recaps can help book reading by refreshing readers' memory of the important elements in previous texts so that they can better understand the ongoing plot. Despite its usefulness, this application has not been well studied in the NLP community. We propose the first benchmark on this useful task, called Recap Snippet Identification, with a hand-crafted evaluation dataset. Our experiments show that the proposed task is challenging for PLMs, LLMs, and our proposed methods, as the task requires a deep understanding of the plot correlation between snippets.

Paper Structure (36 sections, 3 equations, 11 figures, 9 tables)

Figures (11)

  • Figure 1: The proportions of recap snippets in different ranges of the distance between the target and the recap, for all datasets. The frequency of recap snippets decreases as the distance increases.
  • Figure 2: Reader notes can serve as a bridge connecting the two underlined snippets they are attached to. In the example, both notes comment on the Death of Dane (from The Thorn Birds).
  • Figure 3: Performance of l2n when only considering the nearest 20/40/60 snippets on NDDP and DGSD. As the distance increases, the identification becomes harder.
  • Figure 4: Listwise top-5 guessing prompt for TV Productions.
  • Figure 5: Listwise free-guessing prompt for TV Productions.
  • ...and 6 more figures