Investigating Neurons and Heads in Transformer-based LLMs for Typographical Errors
Kohei Tsuji, Tatsuya Hiraoka, Yuchang Cheng, Eiji Aramaki, Tomoya Iwakura
TL;DR
The paper investigates how Transformer-based LLMs handle typos by identifying typo-specific neurons and attention heads that support typo-fixing through local and global contexts. It introduces a method and two data pipelines, $\Delta_n$ and $\Delta_h$, to isolate typo-related neuron activations and attention-head behavior across multiple models. The analysis reveals distinct roles: early- and late-layer typo neurons handle local context, middle-layer typo neurons handle global context, and typo heads attend broadly across the context. Ablation analyses show that these components contribute not only to typo correction but also to general grammatical and contextual understanding, and that reliance on typo heads varies with model size. These findings offer mechanistic insights that could guide robustness improvements by reinforcing both local and global contextual processing and language-structure awareness in LLMs.
Abstract
This paper investigates how LLMs encode inputs with typos. We hypothesize that specific neurons and attention heads recognize typos and fix them internally using local and global contexts. We introduce a method to identify typo neurons and typo heads that are active when inputs contain typos. Our experimental results suggest the following: 1) LLMs can fix typos with local contexts when the typo neurons in either the early or late layers are activated, even if those in the other are not. 2) Typo neurons in the middle layers are responsible for the core of typo-fixing with global contexts. 3) Typo heads fix typos by attending broadly to the context rather than focusing on specific tokens. 4) Typo neurons and typo heads contribute not only to typo-fixing but also to understanding general contexts.
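The idea of finding neurons that are more active on inputs with typos can be illustrated with a minimal sketch. This is not the paper's scoring procedure: it assumes a simple stand-in criterion (the difference in mean activation between typo and clean inputs), and the function name `rank_typo_neurons`, the `top_k` parameter, and the toy activation values are all hypothetical.

```python
from statistics import mean


def rank_typo_neurons(clean_acts, typo_acts, top_k=2):
    """Rank neurons by how much their mean activation rises on typo inputs.

    clean_acts, typo_acts: lists of rows, one row per input, one value per neuron.
    Returns the indices of the top_k neurons with the largest increase.
    Assumption: mean activation difference is an illustrative stand-in only.
    """
    n_neurons = len(clean_acts[0])
    diffs = []
    for j in range(n_neurons):
        clean_mean = mean(row[j] for row in clean_acts)
        typo_mean = mean(row[j] for row in typo_acts)
        diffs.append((typo_mean - clean_mean, j))
    # Largest positive difference first.
    diffs.sort(key=lambda pair: pair[0], reverse=True)
    return [j for _, j in diffs[:top_k]]


# Toy activations for three neurons over two inputs (made-up numbers).
clean = [[0.1, 0.5, 0.2], [0.0, 0.4, 0.3]]
typo = [[0.9, 0.5, 0.1], [0.8, 0.6, 0.3]]
top = rank_typo_neurons(clean, typo, top_k=2)
```

In practice the rows would hold activations recorded from a model on paired clean and typo-injected inputs; here they are hand-written so the ranking is easy to follow.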
