Annotating References to Mythological Entities in French Literature

Thierry Poibeau

TL;DR

The paper evaluates large language models for annotating and interpreting references to mythological entities in contemporary French literature using a dedicated annotation scheme. It demonstrates high annotation accuracy with LLMs but reveals substantial risks in retrieval tasks, where models hallucinate fabricated passages, raising ethical and reproducibility concerns. The authors propose a mythological entity tagset grounded in existing ontologies and show that LLMs can provide interpretive insights, while highlighting the need for human-in-the-loop validation. Scaling up to a corpus of roughly 15,000 novels could enable wide-ranging cultural analytics, revealing how myths are repurposed across genres, periods, and authors. The work contributes practical guidelines for annotation, evidence about the strengths and limits of LLMs in literary analysis, and a roadmap for future large-scale mythography studies.

Abstract

In this paper, we explore the relevance of large language models (LLMs) for annotating references to Roman and Greek mythological entities in modern and contemporary French literature. We present an annotation scheme and demonstrate that recent LLMs can be directly applied to follow this scheme effectively, although not without occasionally making significant analytical errors. Additionally, we show that LLMs (and, more specifically, ChatGPT) are capable of offering interpretative insights into the use of mythological references by literary authors. However, we also find that LLMs struggle to accurately identify relevant passages in novels (when used as an information retrieval engine), often hallucinating and generating fabricated examples, an issue that raises significant ethical concerns. Nonetheless, when used carefully, LLMs remain valuable tools for performing annotations with high accuracy, especially for tasks that would be difficult to annotate comprehensively on a large scale through manual methods alone.

Paper Structure

This paper contains 11 sections and 3 figures.

Figures (3)

  • Figure 1: Examples of hallucinations by ChatGPT when tasked with retrieving passages from specific novels that reference mythology.
  • Figure 2: An example of ChatGPT apologizing for an error but then repeating the same type of mistake by generating another fabricated quotation.
  • Figure 3: NotebookLM confused by the extracts it previously identified from the input text.