Modeling Bottom-up Information Quality during Language Processing

Cui Ding, Yanning Yin, Lena A. Jäger, Ethan Gotlieb Wilcox

TL;DR

The paper asks how the quality of bottom-up information affects reading. It formalizes quality as the mutual information $I(W;O)$ between visual input and word identity, and models reading as a Bayesian update. In English and Chinese, a half-occlusion manipulation (hiding the upper or lower half of each word) probes how information is distributed across word halves. Reading times are measured with MoTR, and multimodal language models are used to estimate $I(W;O)$ and its pointwise variant, information gain (IG). Reduced information quality leads to disproportionate slowdowns: the relationship between information quality and reading time is nonlinear and differs between the two languages. In both scripts the upper half carries more information about word identity than the lower half, and the asymmetry is stronger in English. The study provides a principled, quantitative link between bottom-up signal quality and processing effort, with implications for theories of reading and perceptual decision-making, and shows how cross-linguistic differences in visual structure can shape lexical processing.

Abstract

Contemporary theories model language processing as integrating both top-down expectations and bottom-up inputs. One major prediction of such models is that the quality of the bottom-up inputs modulates ease of processing -- noisy inputs should lead to difficult and effortful comprehension. We test this prediction in the domain of reading. First, we propose an information-theoretic operationalization for the "quality" of bottom-up information as the mutual information (MI) between visual information and word identity. We formalize this prediction in a mathematical model of reading as a Bayesian update. Second, we test our operationalization by comparing participants' reading times in conditions where words' information quality has been reduced, either by occluding their top or bottom half, with full words. We collect data in English and Chinese. We then use multimodal language models to estimate the mutual information between visual inputs and words. We use these data to estimate the specific effect of reduced information quality on reading times. Finally, we compare how information is distributed across visual forms. In English and Chinese, the upper half contains more information about word identity than the lower half. However, the asymmetry is more pronounced in English, a pattern which is reflected in the reading times.
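To make the quantities named in the abstract concrete, the following is a minimal sketch of a Bayesian update over word identity, together with pointwise information gain and mutual information. It rests on several assumptions: IG is taken to be the pointwise mutual information $\log_2 p(w \mid o) - \log_2 p(w)$, which may differ in detail from the paper's definition. The two-word vocabulary, the observation labels, and all probabilities are invented for illustration. The word pair is borrowed from the Figure 2 caption only as labels, and the function names are hypothetical.

```python
import math

def posterior(prior, likelihood, obs):
    """Bayesian update: p(w | o) is proportional to p(o | w) * p(w)."""
    joint = {w: prior[w] * likelihood[w][obs] for w in prior}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def information_gain(prior, likelihood, word, obs):
    """Pointwise information gain in bits (assumed definition):
    log2 p(w | o) - log2 p(w)."""
    post = posterior(prior, likelihood, obs)
    return math.log2(post[word] / prior[word])

def mutual_information(prior, likelihood):
    """I(W;O) in bits: the expectation of the pointwise gain under p(w, o)."""
    total = 0.0
    for w in prior:
        for o, p_o_given_w in likelihood[w].items():
            if p_o_given_w > 0:  # zero-probability pairs contribute nothing
                gain = information_gain(prior, likelihood, w, o)
                total += prior[w] * p_o_given_w * gain
    return total

# Toy setup (invented numbers): two equally likely words and two
# possible visual observations, "shape_r" and "shape_l".
prior = {"hear": 0.5, "heal": 0.5}

# Fully visible word: the visual input identifies the word perfectly.
full_view = {
    "hear": {"shape_r": 1.0, "shape_l": 0.0},
    "heal": {"shape_r": 0.0, "shape_l": 1.0},
}

# Half-occluded word: the visual input is only partly diagnostic.
occluded = {
    "hear": {"shape_r": 0.8, "shape_l": 0.2},
    "heal": {"shape_r": 0.2, "shape_l": 0.8},
}

mi_full = mutual_information(prior, full_view)
mi_occluded = mutual_information(prior, occluded)
ig_occluded = information_gain(prior, occluded, "hear", "shape_r")

print(f"I(W;O), full view: {mi_full:.3f} bits")
print(f"I(W;O), occluded:  {mi_occluded:.3f} bits")
print(f"IG(hear, shape_r), occluded: {ig_occluded:.3f} bits")
```

In this toy case the fully visible condition recovers the whole bit of uncertainty in the prior, while the occluded condition recovers only part of it. That is the sense in which occlusion lowers information quality under this operationalization.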

Paper Structure

This paper contains 28 sections, 12 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Example showing a screen from a MoTR trial with our three different reading conditions.
  • Figure 2: Results of fine-tuned Qwen2.5 model for the Chinese character 美 ("beautiful") and the English word hear. The preference for hear over heal in upper-half occlusion likely reflects pre-training frequency bias, which we control for by training TransOCR from scratch.
  • Figure 3: (a) Reading times (FPRT) measured under three visibility conditions. Boxes represent the interquartile range (middle 50%), center lines indicate the median, and whiskers show the overall data spread. Grey lines trace each participant’s mean across conditions. EN: English; ZH: Simplified Chinese. (b) Information gain (IG) between word identity and visual form under the three conditions, obtained with each of our estimation techniques.
  • Figure 4: Relationship between informational quality of individual words (information gain; IG) and excess reading time. Solid blue lines are smoothed GAM fits; shaded regions show bootstrapped 95% confidence intervals. Red tick marks along the bottom (rug plots) indicate the distribution of IG data points. Reading times are shifted so that each curve reaches zero at its highest-MI end, which emphasizes the relative excess reading time as information quality decreases.
  • Figure 5: Self-rated ease of reading across visibility conditions. Participants were asked to judge whether the upper or lower half of words was easier to read.