Inducing Dyslexia in Vision Language Models
Melika Honarmand, Ayati Sharma, Badr AlKhamissi, Johannes Mehrer, Martin Schrimpf
TL;DR
This work models dyslexia by functionally identifying and ablating visual-word-form-selective units in a vision-language model (VLM), producing reading-specific impairments while leaving general reasoning intact. Using benchmarks designed for humans (ROAR, Raven, Kempler, and orthography/phonology lexical tasks), the authors show that ablating these units, which are analogous to the visual word form area (VWFA), yields phonological deficits without significant orthographic disruption, in line with human dyslexia patterns. The study identifies a minimal subnetwork (about 6.89% of units) whose ablation reproduces key dyslexic features and interprets the results in terms of VWFA function, providing a causal, mechanistic framework for studying reading disorders. Beyond dyslexia, the paper presents a generalizable methodology: functional localization and targeted perturbation in VLMs, used as digital twins, to explore brain disorders and develop hypothesis-driven interventions.
Abstract
Dyslexia, a neurodevelopmental disorder characterized by persistent reading difficulties, is often linked to reduced activity of the visual word form area in the ventral occipito-temporal cortex. Traditional approaches to studying dyslexia, such as behavioral and neuroimaging methods, have provided valuable insights but remain limited in their ability to test causal hypotheses about the underlying mechanisms of reading impairments. In this study, we use large-scale vision-language models (VLMs) to simulate dyslexia by functionally identifying and perturbing artificial analogues of word processing. Using stimuli from cognitive neuroscience, we identify visual-word-form-selective units within VLMs and demonstrate that targeted ablation of these units, unlike ablation of random units, leads to selective impairments in reading tasks while general visual and language comprehension abilities remain intact. In particular, the resulting model matches dyslexic humans' phonological deficits without a significant change in orthographic processing. Taken together, our modeling results replicate key characteristics of dyslexia and establish a computational framework for investigating reading disorders.
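The localize-then-ablate procedure described in the abstract can be illustrated with a minimal sketch on synthetic data. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `localize_selective_units` and `ablate`, the use of a Welch t-statistic as the selectivity measure, the 5% selection fraction, and the random activations standing in for real VLM unit responses.

```python
import numpy as np

def localize_selective_units(word_acts, control_acts, fraction):
    """Return indices of the units most selective for word stimuli.

    Selectivity is scored with a Welch t-statistic (an assumed choice)
    computed per unit across stimuli; inputs have shape (n_stimuli, n_units).
    """
    mean_diff = word_acts.mean(axis=0) - control_acts.mean(axis=0)
    var_w = word_acts.var(axis=0, ddof=1) / word_acts.shape[0]
    var_c = control_acts.var(axis=0, ddof=1) / control_acts.shape[0]
    t_stat = mean_diff / np.sqrt(var_w + var_c + 1e-12)
    k = max(1, int(round(fraction * t_stat.size)))
    return np.argsort(t_stat)[-k:]  # top-k units by t-statistic

def ablate(acts, unit_idx):
    """Zero out the given units, leaving all other units unchanged."""
    out = acts.copy()
    out[..., unit_idx] = 0.0
    return out

# Synthetic stand-in for unit activations (hypothetical data).
rng = np.random.default_rng(0)
n_units = 1000
word_acts = rng.normal(size=(200, n_units))
control_acts = rng.normal(size=(200, n_units))
planted = np.arange(50)          # units we make word-selective on purpose
word_acts[:, planted] += 2.0

# Targeted ablation versus a size-matched random-unit control.
selected = localize_selective_units(word_acts, control_acts, fraction=0.05)
random_units = rng.choice(n_units, size=selected.size, replace=False)

lesioned = ablate(word_acts, selected)
control_lesioned = ablate(word_acts, random_units)
```

In an actual experiment the ablation would be applied inside the model's forward pass (for example through hooks on the relevant layers), and the targeted and random lesions would then be compared on the reading and non-reading benchmarks.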
