The Astonishing Ability of Large Language Models to Parse Jabberwockified Language

Gary Lupyan; Senyi Yang

TL;DR

Although the abilities of LLMs to make sense of "Jabberwockified" English are clearly superhuman, they are highly relevant to understanding linguistic structure and suggest that efficient language processing in either biological or artificial systems likely benefits from very tight integration between syntax, lexical semantics, and general world knowledge.

Abstract

We show that large language models (LLMs) have an astonishing ability to recover meaning from severely degraded English texts. Texts in which content words have been randomly replaced with nonsense strings, e.g., "At the ghybe of the swuint, we are haiveed to Wourge Phrear-gwurr, who sproles into an ghitch flount with his crurp", can be translated to conventional English that is, in many cases, close to the original text, e.g., "At the start of the story, we meet a man, Chow, who moves into an apartment building with his wife." These results show that structural cues (e.g., morphosyntax, closed-class words) constrain lexical meaning to a much larger degree than previously imagined. Although the abilities of LLMs to make sense of "Jabberwockified" English are clearly superhuman, they are highly relevant to understanding linguistic structure and suggest that efficient language processing in either biological or artificial systems likely benefits from very tight integration between syntax, lexical semantics, and general world knowledge.
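To make the manipulation described in the abstract concrete, the following is a minimal sketch of how a text might be "Jabberwockified". It is not the authors' procedure: the tiny function-word list, the suffix list, the syllable inventories, and the choice to map a repeated word to the same nonce are all assumptions made for illustration, and the names `FUNCTION_WORDS`, `make_nonce`, and `jabberwockify` are hypothetical.

```python
import random
import re

# Hypothetical, deliberately tiny closed-class list; a real one would be far longer.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "to", "in", "into", "at", "with", "and", "or",
    "we", "he", "she", "it", "they", "his", "her", "who", "are", "is", "was",
}
SUFFIXES = ("ing", "ed", "s")  # assumed inflections to leave visible

# Assumed syllable pieces used to build pronounceable nonsense stems.
ONSETS = ["gh", "sw", "cr", "fl", "spr", "w", "phr", "h"]
VOWELS = ["y", "ui", "ai", "ou", "o", "u", "i", "ea"]
CODAS = ["be", "nt", "ve", "rge", "le", "tch", "rp", "r"]


def make_nonce(rng):
    """Build one nonsense stem as onset + vowel + coda."""
    return rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)


def jabberwockify(text, seed=0):
    """Replace content words with nonce stems, keeping function words,
    punctuation, capitalization, and a few inflectional suffixes."""
    rng = random.Random(seed)
    mapping = {}  # assumption: a repeated stem always gets the same nonce

    def swap(match):
        word = match.group(0)
        lower = word.lower()
        if lower in FUNCTION_WORDS:
            return word  # closed-class words pass through unchanged
        # Keep a suffix only if a stem of at least three letters remains.
        suffix = next(
            (s for s in SUFFIXES if lower.endswith(s) and len(lower) > len(s) + 2),
            "",
        )
        stem = lower[: len(lower) - len(suffix)] if suffix else lower
        if stem not in mapping:
            mapping[stem] = make_nonce(rng)
        nonce = mapping[stem] + suffix
        return nonce.capitalize() if word[0].isupper() else nonce

    return re.sub(r"[A-Za-z]+", swap, text)
```

Calling `jabberwockify("... who moves into an apartment building with his wife.")` leaves "who", "into", "an", "with", and "his" intact and keeps the "-s" and "-ing" endings, which is the kind of structural residue the abstract argues constrains lexical meaning.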

Paper Structure (14 sections, 4 figures, 4 tables)

The power of linguistic constructions
Key results that led to the present study
Methods
Procedure to create Jabberwockified texts.
Nonce word selection.
LLM selection.
Stimuli.
Translating and evaluating Jabberwockified texts
Additional manipulations
Results
Discussion
The role of pretraining.
What about people?
Limitations

Figures (4)

Figure 1: Comparison of translation accuracy by genre. Gray dots show similarity between the passage and a random non-target passage from the same genre.
Figure 2: Translation accuracy for content-matched passages that were either guaranteed not to be in the pretraining data (offline) or were in the pretraining data (online).
Figure 3: A comparison of meaning recovery of the 150-passage dataset when degraded through the standard Jabberwocky procedure and five variants (see Table 3 for details). Error bars show 95% within-passage CI.
Figure 4: Meaning recovery for a subset of passages that were tested incrementally, one sentence at a time, along with their original provenance (S=Screenplay; F=Fiction excerpt).
