Large Language Models Preserve Semantic Isotopies in Story Continuations

Marc Cavazza

TL;DR

This work investigates whether large language models preserve semantic isotopies (structural semantic threads) when continuing stories. By constructing a 10,000-story ROCStories-based pipeline across five LLMs and using GPT-4o to extract isotopies, the authors quantify isotopy coverage, density, and spread in completed texts and analyze semantic properties such as relevance and label distributions. They find that isotopies persist through completion, with high coverage, density comparable to human benchmarks, and near-even spread, while showing topic convergence but lexical variation across models. The study advances the case for a textual-semantic lens on LLM behavior and suggests practical uses for assessing cohesion and guiding training, while acknowledging horizon-length and interpretative limitations. Overall, the results provide initial empirical support that LLM-generated text maintains meaningful isotopic structure consistent with structural semantics.

Abstract

In this work, we explore the relevance of textual semantics to Large Language Models (LLMs), extending previous insights into the connection between distributional semantics and structural semantics. We investigate whether LLM-generated texts preserve semantic isotopies. We design a story continuation experiment using 10,000 ROCStories prompts completed by five LLMs. We first validate GPT-4o's ability to extract isotopies from a linguistic benchmark, then apply it to the generated stories. We then analyze structural (coverage, density, spread) and semantic properties of isotopies to assess how they are affected by completion. Results show that LLM completion within a given token horizon preserves semantic isotopies across multiple properties.
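To make the three structural properties more concrete, the following is a minimal sketch of how coverage, density, and spread might be computed for a single isotopy. The abstract does not give the paper's formulas, so the definitions here are illustrative assumptions: coverage is taken as the share of the text spanned between the first and last constituent, density as constituents per token, and spread as the share of constituents that fall after the pivot (the end of the story primer). The `Isotopy` class and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Isotopy:
    """Hypothetical container: a label plus token indices of its constituents."""
    label: str
    positions: list[int]


def coverage(iso: Isotopy, n_tokens: int) -> float:
    # Assumed definition: fraction of the text lying between the first
    # and last constituent of the isotopy (inclusive).
    if not iso.positions or n_tokens <= 0:
        return 0.0
    return (max(iso.positions) - min(iso.positions) + 1) / n_tokens


def density(iso: Isotopy, n_tokens: int) -> float:
    # Assumed definition: number of constituents per token of text.
    if n_tokens <= 0:
        return 0.0
    return len(iso.positions) / n_tokens


def spread(iso: Isotopy, pivot: int) -> float:
    # Assumed definition: share of constituents located in the completion,
    # i.e. at or after the pivot token. A value near 0.5 would mean the
    # isotopy is split evenly between primer and completion.
    if not iso.positions:
        return 0.0
    after = sum(1 for p in iso.positions if p >= pivot)
    return after / len(iso.positions)


# Toy example: a 12-token story whose primer ends at token 6.
iso = Isotopy(label="travel", positions=[1, 4, 7, 10])
print(coverage(iso, 12), density(iso, 12), spread(iso, pivot=6))
```

Under these assumed definitions, an isotopy that persists through completion would show high coverage and a spread that stays well above zero.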

Paper Structure

This paper contains 17 sections, 11 figures, 1 table.

Figures (11)

  • Figure 1: Isotopy across a story completion. Isotopy constituents are highlighted, and the pivot marks the end of the story primer (see text for details).
  • Figure 2: System and user prompts for isotopy extraction (note the various steps inspired by the interpretative approach).
  • Figure 3: GPT-4o performance on the isotopy benchmark. 50% of isotopy labels are literal matches, with a further 19.4% matched using a multilingual embedding.
  • Figure 4: Completion ratio (generated text to original primer) across all five LLMs.
  • Figure 5: Global coverage for isotopies extracted from each LLM-generated completion. All models achieve significant coverage of the text.
  • ...and 6 more figures