Graded strength of comparative illusions is explained by Bayesian inference
Yuhan Zhang, Erxiao Wang, Cory Shain
TL;DR
The paper explains the comparative illusion (CI) within a noisy-channel Bayesian framework, predicting acceptability from the posterior $p(s_i|s_p)$. It combines language-model priors with a data-driven noise likelihood derived from human sentence-correction data, yielding an approximate posterior $\hat{p}(s_i|s_p)$ that accounts for both graded illusion strength and subject-type effects. Across a large-scale Experiment 1 and a corrective Experiment 2, it shows that averaging over multiple plausible interpretations (the $f_{mean}$ linking function) explains acceptability better than considering only the single most likely interpretation, which supports a probabilistic, multi-interpretation account of processing. The findings advance a unified, computational-level view of language processing in which acceptability judgments reflect real-time posterior probabilities, and they suggest principles that may generalize to other language illusions.
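The contrast between averaging over interpretations and keeping only the most likely one can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the interpretation labels, the prior and likelihood values, the helper names `posterior_scores`, `f_mean`, and `f_max`, and the choice to score each interpretation by the unnormalized product of prior and noise likelihood. It is not the authors' implementation, and the paper's exact definition of $f_{mean}$ may differ.

```python
from statistics import mean


def posterior_scores(candidates):
    """Score each candidate interpretation s_i of a perceived sentence s_p.

    `candidates` maps a label to (prior, likelihood), where prior stands in
    for p(s_i) and likelihood for p(s_p | s_i). The product is proportional
    to the posterior p(s_i | s_p); no normalization is applied (assumption).
    """
    return {label: prior * lik for label, (prior, lik) in candidates.items()}


def f_mean(scores):
    """Linking function that averages over all plausible interpretations."""
    return mean(scores.values())


def f_max(scores):
    """Linking function that keeps only the single best interpretation."""
    return max(scores.values())


# Hypothetical toy items: the numbers are invented, not taken from the paper.
# Both items share the same best interpretation, but item_b's second
# interpretation is far less probable than item_a's.
item_a = {"interp_1": (0.02, 0.5), "interp_2": (0.02, 0.4)}
item_b = {"interp_1": (0.02, 0.5), "interp_2": (0.001, 0.1)}

scores_a = posterior_scores(item_a)
scores_b = posterior_scores(item_b)

print("f_max :", f_max(scores_a), f_max(scores_b))
print("f_mean:", f_mean(scores_a), f_mean(scores_b))
```

In this toy setup the two items are indistinguishable under `f_max` but differ under `f_mean`, which is the kind of graded difference a multi-interpretation linking function can capture and a single-interpretation one cannot.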
Abstract
Like visual processing, language processing is susceptible to illusions in which people systematically misperceive stimuli. In one such case, the comparative illusion (CI; e.g., "More students have been to Russia than I have"), comprehenders tend to judge the sentence as acceptable despite its underlying nonsensical comparison. Prior research has argued that this phenomenon can be explained as Bayesian inference over a noisy channel: the posterior probability of an interpretation of a sentence is proportional to both the prior probability of that interpretation and the likelihood of its corruption into the observed (CI) sentence. Initial behavioral work has supported this claim by evaluating a narrow set of alternative interpretations of CI sentences and showing that comprehenders favor interpretations that are more likely to have been corrupted into the illusory sentence. In this study, we replicate and go substantially beyond this earlier work by directly predicting the strength of the illusion with a quantitative model of the posterior probability of plausible interpretations, which we derive through a novel synthesis of statistical language models with human behavioral data. Our model explains not only the fine gradations in the strength of CI effects, but also a previously unexplained effect caused by pronominal vs. full noun phrase than-clause subjects. These findings support a noisy-channel theory of sentence comprehension by demonstrating that the theory makes novel predictions about the comparative illusion that bear out empirically. This outcome joins related evidence of noisy-channel processing in both illusory and non-illusory contexts to support noisy-channel inference as a unified computational-level theory of diverse language processing phenomena.
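The proportionality stated in the abstract is Bayes' rule. Written in the notation of the TL;DR, and assuming that $s_i$ denotes a candidate interpretation and $s_p$ the observed (perceived) sentence, with $s'$ introduced here only as a summation index over candidate interpretations:

$$
p(s_i \mid s_p) = \frac{p(s_i)\, p(s_p \mid s_i)}{\sum_{s'} p(s')\, p(s_p \mid s')} \;\propto\; p(s_i)\, p(s_p \mid s_i)
$$

Here $p(s_i)$ is the prior probability of the interpretation and $p(s_p \mid s_i)$ is the noise likelihood, that is, the probability that $s_i$ is corrupted into the observed sentence.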
