SRS-Stories: Vocabulary-constrained multilingual story generation for language learning

Wiktor Kamzela, Mateusz Lango, Ondrej Dusek

TL;DR

This paper addresses vocabulary acquisition for language learners by integrating a Spaced Repetition System with automatic story generation that confines text to the learner’s known vocabulary plus targeted new words. The authors propose three prompting strategies and three lexical-constraint enforcement methods, validated across English, Chinese, and Polish, and benchmark against constrained beam search. They show that LLM-driven SRS-Stories produce more grammatical, coherent, and contextually rich examples of word usage, with strong multilingual performance and positive human judgments. The work highlights the potential of vocabulary-constrained storytelling as a scalable, engaging alternative to traditional flashcards for language learning, while noting limitations in bias, OOV handling, and cross-language metric reliability.

Abstract

In this paper, we use large language models to generate personalized stories for language learners, using only the vocabulary they know. The generated texts are specifically written to teach the user new vocabulary by simply reading stories where it appears in context, while at the same time seamlessly reviewing recently learned vocabulary. The generated stories are enjoyable to read and the vocabulary reviewing/learning is optimized by a Spaced Repetition System. The experiments are conducted in three languages: English, Chinese and Polish, evaluating three story generation methods and three strategies for enforcing lexical constraints. The results show that the generated stories are more grammatical, coherent, and provide better examples of word usage than texts generated by the standard constrained beam search approach.
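
To make the lexical constraint concrete, the sketch below checks whether a story stays within a learner's known vocabulary plus a set of target words. It is an illustration only, not the authors' enforcement method: the function names `out_of_vocabulary` and `missing_targets` are hypothetical, and the regex tokenizer assumes space-separated words with no inflection, so Chinese would need a word segmenter and Polish a lemmatizer.

```python
import re

# Hypothetical helper: matches runs of Unicode letters (no digits, no underscore).
_WORD = re.compile(r"[^\W\d_]+")


def out_of_vocabulary(story: str, known: set[str], targets: set[str]) -> set[str]:
    """Return the words in `story` that are neither known nor targeted."""
    allowed = {w.lower() for w in known | targets}
    tokens = _WORD.findall(story.lower())
    return {t for t in tokens if t not in allowed}


def missing_targets(story: str, targets: set[str]) -> set[str]:
    """Return the target words that never appear in `story`."""
    tokens = set(_WORD.findall(story.lower()))
    return {t.lower() for t in targets} - tokens


known = {"the", "cat", "sat", "on", "a"}
targets = {"mat"}
print(out_of_vocabulary("The cat sat on a sofa.", known, targets))
print(missing_targets("The cat sat on a sofa.", targets))
```

A story satisfies the constraint in this simplified sense when both functions return an empty set: no word falls outside the allowed vocabulary, and every target word is used at least once.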

Paper Structure

This paper contains 43 sections, 1 figure, and 13 tables.

Figures (1)

  • Figure 1: Overview of a classic SRS system and the presented approach: SRS-Stories.