The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling
Ruochen Zhang, Qinan Yu, Matianyu Zang, Carsten Eickhoff, Ellie Pavlick
TL;DR
The paper investigates whether large language models (LLMs) handle linguistic structure with shared internal circuits across languages. It applies mechanistic interpretability methods (path patching and information flow routes) to an indirect object identification (IOI) task and a past-tense task in English and Chinese, across multilingual and monolingual models. Common syntactic processes are handled by largely shared circuits, whereas language-specific morphology engages specialized components, such as feed-forward networks (FFNs) in English. This reveals a trade-off: models exploit universal processing patterns yet preserve language-specific differences, which informs cross-lingual transfer and multilingual model design. The work provides a principled framework for dissecting multilingual internal representations and points to future research on exploiting circuit overlap to improve multilingual robustness and safety.
Abstract
We employ new tools from mechanistic interpretability in order to ask whether the internal structure of large language models (LLMs) shows correspondence to the linguistic structures which underlie the languages on which they are trained. In particular, we ask (1) when two languages employ the same morphosyntactic processes, do LLMs handle them using shared internal circuitry? and (2) when two languages require different morphosyntactic processes, do LLMs handle them using different internal circuitry? Using English and Chinese multilingual and monolingual models, we analyze the internal circuitry involved in two tasks. We find evidence that models employ the same circuit to handle the same syntactic process independently of the language in which it occurs, and that this is the case even for monolingual models trained completely independently. Moreover, we show that multilingual models employ language-specific components (attention heads and feed-forward networks) when needed to handle linguistic processes (e.g., morphological marking) that only exist in some languages. Together, our results provide new insights into how LLMs trade off between exploiting common structures and preserving linguistic differences when tasked with modeling multiple languages simultaneously.
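To make the notion of a "shared circuit" concrete, the sketch below shows one simple way to quantify how much two circuits overlap. It is only an illustration under stated assumptions: a circuit is represented as a set of hypothetical `(layer, head)` pairs, overlap is measured with a Jaccard (intersection-over-union) score, and the head indices are invented. None of this is taken from the paper, whose own comparison procedure may differ.

```python
from typing import Set, Tuple

# Assumed representation: a circuit is a set of (layer, head) attention-head indices.
Component = Tuple[int, int]


def circuit_overlap(a: Set[Component], b: Set[Component]) -> float:
    """Jaccard (intersection-over-union) overlap between two component sets."""
    if not a and not b:
        return 1.0  # two empty circuits are treated as identical
    return len(a & b) / len(a | b)


# Made-up head sets for illustration only; these are not results from the paper.
english_heads = {(9, 6), (9, 9), (10, 0), (7, 3)}
chinese_heads = {(9, 6), (9, 9), (10, 0), (8, 1)}

overlap = circuit_overlap(english_heads, chinese_heads)
print(f"circuit overlap: {overlap:.2f}")
```

A score near 1 would suggest that the two languages rely on nearly the same heads for a task, while a score near 0 would suggest largely language-specific circuitry.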
