Do Multilingual LLMs Think In English?

Lisa Schut, Yarin Gal, Sebastian Farquhar

TL;DR

This work probes whether multilingual LLMs truly reason in a universal, language-agnostic latent space or rely on an English-centric representation. Using the logit lens, causal tracing, and steering vectors across several open-source models and languages, it demonstrates that lexical content is often processed in an English-adjacent space, while non-lexical elements align more with the input language. It shows that English-derived steering vectors can influence outputs more effectively than vectors from the target language, and that cross-language facts share similar latent regions though they are frequently decoded in English. These findings have implications for the fairness, safety, and robustness of multilingual LLMs and suggest care when deploying such models in non-English linguistic environments.

Abstract

Large language models (LLMs) have multilingual capabilities and can solve tasks across various languages. However, we show that current LLMs make key decisions in a representation space closest to English, regardless of their input and output languages. Exploring the internal representations with a logit lens for sentences in French, German, Dutch, and Mandarin, we show that the LLM first emits representations close to English for semantically-loaded words before translating them into the target language. We further show that activation steering in these LLMs is more effective when the steering vectors are computed in English rather than in the language of the inputs and outputs. This suggests that multilingual LLMs perform key reasoning steps in a representation that is heavily shaped by English in a way that is not transparent to system users.
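The two techniques named in the abstract can be illustrated with a minimal toy sketch. It assumes a four-word vocabulary, an identity unembedding matrix, and hand-picked hidden states, all of which are hypothetical and chosen only for illustration; no real model is loaded, and a real logit lens would typically also apply the model's final normalization layer before unembedding. The steering vector is built as a difference of mean activations, a common construction that is assumed here and is not necessarily the authors' exact procedure.

```python
import numpy as np


def logit_lens(hidden_states, unembed, vocab):
    """Decode every layer's hidden state with the output (unembedding) matrix.

    hidden_states: (n_layers, d_model) residual stream at one token position.
    unembed:       (d_model, vocab_size) output projection.
    Returns the top-1 vocabulary entry for each layer.
    """
    logits = hidden_states @ unembed  # (n_layers, vocab_size)
    return [vocab[i] for i in logits.argmax(axis=-1)]


def steering_vector(pos_acts, neg_acts):
    """Difference of mean activations between two prompt sets (assumed recipe)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)


def steer(hidden, vector, alpha=1.0):
    """Add a scaled steering vector to a hidden state."""
    return hidden + alpha * vector


# Hypothetical toy setup: 4-word vocabulary, identity unembedding.
vocab = ["water", "eau", "the", "la"]
unembed = np.eye(4)

# Hypothetical hidden states for three layers at one position.
hidden_states = np.array([
    [0.1, 0.0, 0.9, 0.2],  # early layer: a function word dominates
    [0.8, 0.3, 0.1, 0.0],  # middle layer: closest to the English noun
    [0.4, 0.9, 0.0, 0.1],  # final layer: the French noun wins
])
per_layer = logit_lens(hidden_states, unembed, vocab)

# Hypothetical activations collected from two contrasting prompt sets.
pos_acts = np.array([[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0]])
neg_acts = np.array([[0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
vec = steering_vector(pos_acts, neg_acts)

# Steering the final-layer state changes the decoded token.
steered = steer(hidden_states[2], vec, alpha=1.0)
steered_token = logit_lens(steered[None, :], unembed, vocab)[0]

print(per_layer)
print(steered_token)
```

In this toy, the middle layer decodes to the English word before the final layer settles on the French one, which mirrors the pattern the abstract describes for semantically loaded words. With a real model, the per-layer hidden states would come from the model's intermediate activations rather than from hand-written arrays.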

Paper Structure

This paper contains 51 sections, 7 equations, 63 figures, 11 tables.

Figures (63)

  • Figure 1: Logit lens applied to Llama-3.1-70B's latent space when prompted with "Le bateau naviguait en douceur sur l'". Each row depicts the decoded latent representations for one layer, and each column corresponds to a generated token. Dark red boxes highlight words selected in English. The nouns 'eau', 'lac', and 'soleil' are selected in English, whereas other parts of speech are not.
  • Figure 2: Aya-23-35B
  • Figure 3: Llama-3.1-70B
  • Figure 4: Mixtral-8x22B
  • Figure 5: Gemma-2-27b
  • ...and 58 more figures