An information-theoretic model of shallow and deep language comprehension
Jiaxuan Li, Richard Futrell
TL;DR
The paper addresses how online language comprehension can be shallow under resource constraints, yielding interpretations that may be plausible but inaccurate, by formalizing a rate-distortion-type trade-off between processing depth and distortion. It introduces an information-theoretic model in which the interpretation policy $p_t(w|x)$ minimizes expected distortion $\mathbb{E}[d(w,x)]$ subject to a time-varying depth constraint, yielding $p_t(w|x) \propto p_0(w)\, e^{-\lambda_0 t\, d(w,x)}$, with processing depth $D(t) = \mathbb{E}_{p(x)}[D_{KL}(p_t(w|x)\,\|\,p_0(w))]$. The authors link processing effort to the rate of change of depth, $D'(t)$, proposing that EEG signals and reading times reflect $D'(t)$, and show that the model can simulate N400 and P600 ERP patterns as well as garden-path reading-time effects. Validation against EEG datasets and garden-path reading times shows that the framework accounts for both behavioral and neural data, offering a resource-rational account of shallow-to-deep processing and a unified view of language comprehension dynamics.
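To make the formulas above concrete, here is a minimal Python sketch of the interpretation policy $p_t(w|x)$, the depth $D(t)$ in bits, and a finite-difference estimate of $D'(t)$. The toy prior, distortion values, input distribution, the default $\lambda_0 = 1$, and all function names are illustrative assumptions, not taken from the paper.

```python
import math

# Toy setup (all values are assumptions for illustration only).
PRIOR = [0.7, 0.2, 0.1]            # p_0(w) over three candidate interpretations
DISTORTIONS = [[1.0, 0.0, 2.0],    # d(w, x) for input x_1
               [0.0, 1.0, 2.0]]    # d(w, x) for input x_2
P_X = [0.5, 0.5]                   # p(x) over the two inputs


def interpretation_policy(prior, distortion, t, lam0=1.0):
    """p_t(w|x) proportional to p_0(w) * exp(-lam0 * t * d(w, x)), for one x."""
    weights = [p * math.exp(-lam0 * t * d) for p, d in zip(prior, distortion)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_bits(p, q):
    """KL divergence D_KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def processing_depth(prior, distortions, p_x, t, lam0=1.0):
    """D(t) = E_{p(x)}[ D_KL( p_t(w|x) || p_0(w) ) ]."""
    return sum(
        px * kl_bits(interpretation_policy(prior, d, t, lam0), prior)
        for px, d in zip(p_x, distortions)
    )


def processing_effort(prior, distortions, p_x, t, lam0=1.0, dt=1e-4):
    """Finite-difference approximation to D'(t); one-sided at t = 0."""
    lo = max(t - dt, 0.0)
    hi = t + dt
    d_hi = processing_depth(prior, distortions, p_x, hi, lam0)
    d_lo = processing_depth(prior, distortions, p_x, lo, lam0)
    return (d_hi - d_lo) / (hi - lo)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 4.0):
        depth = processing_depth(PRIOR, DISTORTIONS, P_X, t)
        effort = processing_effort(PRIOR, DISTORTIONS, P_X, t)
        print(f"t={t:.1f}  D(t)={depth:.4f} bits  D'(t)~{effort:.4f}")
```

In this toy setting the policy equals the prior at $t = 0$, so $D(0) = 0$, and depth grows as the policy concentrates on the lowest-distortion interpretation of each input. The numerical derivative stands in for $D'(t)$ only as an approximation.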
Abstract
A large body of work in psycholinguistics has focused on the idea that online language comprehension can be shallow or 'good enough': given constraints on time or available computation, comprehenders may form interpretations of their input that are plausible but inaccurate. However, this idea has not yet been linked with formal theories of computation under resource constraints. Here we use information theory to formulate a model of language comprehension as an optimal trade-off between accuracy and processing depth, formalized as bits of information extracted from the input, which increases with processing time. The model provides a measure of processing effort as the change in processing depth, which we link to EEG signals and reading times. We validate our theory against a large-scale dataset of garden path sentence reading times, and EEG experiments featuring N400, P600 and biphasic ERP effects. By quantifying the timecourse of language processing as it proceeds from shallow to deep, our model provides a unified framework to explain behavioral and neural signatures of language comprehension.
