OWL: Probing Cross-Lingual Recall of Memorized Texts via World Literature
Alisha Srivastava, Emir Korukluoglu, Minh Nhat Le, Duyen Tran, Chau Minh Pham, Marzena Karpinska, Mohit Iyyer
TL;DR
This study probes multilingual and cross-lingual memorization in large language models with OWL, a dataset of 31,540 aligned literary passages in ten languages (English originals, official translations, and new translations into six low-resource languages) plus audio. It introduces three probing tasks (direct probing, name cloze, and prefix probing) to quantify memorization and cross-lingual transfer, alongside perturbation, quantization, and audio ablation analyses. The results show that models can recall content across languages, with stronger performance on direct probing than on cloze-style prompts, and that cross-lingual transfer occurs even for unseen translations, though with lower accuracy. Character names emerge as a strong cue for recall, and perturbations or modality changes reduce but do not eliminate memorized knowledge. The authors release the dataset and code to support further research, while acknowledging limitations around translation quality, training data opacity, and legal and ethical considerations.
Abstract
Large language models (LLMs) are known to memorize and recall English text from their pretraining data. However, the extent to which this ability generalizes to non-English languages or transfers across languages remains unclear. This paper investigates multilingual and cross-lingual memorization in LLMs, probing whether memorized content in one language (e.g., English) can be recalled when presented in translation. To do so, we introduce OWL, a dataset of 31.5K aligned excerpts from 20 books in ten languages, including English originals, official translations (Vietnamese, Spanish, Turkish), and new translations into six low-resource languages (Sesotho, Yoruba, Maithili, Malagasy, Setswana, Tahitian). We evaluate memorization across model families and sizes through three tasks: (1) direct probing, which asks the model to identify a book's title and author; (2) name cloze, which requires predicting masked character names; and (3) prefix probing, which involves generating continuations. We find that LLMs consistently recall content across languages, even for texts without a direct translation in pretraining data. GPT-4o, for example, identifies authors and titles 69% of the time and masked entities 6% of the time in newly translated excerpts. Perturbations (e.g., masking characters, shuffling words) modestly reduce direct probing accuracy (7% drop for shuffled official translations). Our results highlight the extent of cross-lingual memorization and provide insight into how models differ.
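To make the three tasks and the word-shuffling perturbation concrete, the following is a minimal Python sketch of how probe inputs might be built from a single passage. It is an illustration under stated assumptions, not the authors' released code: the function names, the `[MASK]` token, the prompt wording, the 50% prefix split, and the sample sentence are all hypothetical.

```python
import random
import re

MASK = "[MASK]"  # assumed mask token; the paper's exact token may differ


def direct_probe_prompt(passage: str) -> str:
    """Direct probing: ask for the title and author of the source book.

    The prompt wording here is illustrative only.
    """
    return (
        "Identify the title and author of the book this passage comes from.\n\n"
        f"Passage: {passage}"
    )


def mask_name(passage: str, name: str) -> str:
    """Name cloze: replace whole-word occurrences of a character name with MASK."""
    return re.sub(rf"\b{re.escape(name)}\b", MASK, passage)


def split_prefix(passage: str, fraction: float = 0.5) -> tuple[str, str]:
    """Prefix probing: split into a prefix (model input) and a reference continuation.

    The split fraction is an assumption made for this sketch.
    """
    words = passage.split()
    cut = max(1, int(len(words) * fraction))
    return " ".join(words[:cut]), " ".join(words[cut:])


def shuffle_words(passage: str, seed: int = 0) -> str:
    """Perturbation: shuffle whitespace-separated tokens with a fixed seed."""
    words = passage.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)


if __name__ == "__main__":
    # Made-up sentence, not taken from the OWL dataset.
    passage = "Alice looked at the rabbit and the rabbit looked at Alice"
    print(direct_probe_prompt(passage))
    print(mask_name(passage, "Alice"))
    print(split_prefix(passage))
    print(shuffle_words(passage))
```

Each helper produces only the model input (and, for prefix probing, the reference continuation). Scoring the model's answer against the gold title, name, or continuation is a separate step that this sketch does not cover.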
