Unlocking Multilingual Reasoning Capability of LLMs and LVLMs through Representation Engineering

Qiming Li, Xiaocheng Feng, Yixuan Ma, Zekai Ye, Ruihan Chen, Xiachong Feng, Bing Qin

TL;DR

MRRE addresses multilingual fairness gaps in reasoning with a training-free, inference-time method. It applies two precomputed vectors (cross-lingual reasoning enhancement and target-language output anchoring) at selected layers to induce English-like reasoning for non-English inputs and then restore target-language output. Across six models and four benchmarks, MRRE yields an average gain of about 5.5% in non-English reasoning and improves input-output language consistency, with larger gains in Thai and Swahili, and it generalizes across modalities and datasets. The approach offers a practical, model-agnostic way to improve multilingual reasoning without additional training data or translation tools, with implications for fair, scalable deployment of LLMs and LVLMs.

Abstract

Large Language Models (LLMs) and Large Vision-Language Models (LVLMs) demonstrate strong reasoning capabilities, yet they perform significantly better in English than in low-resource languages, raising fairness concerns in multilingual applications. Existing approaches either rely on costly multilingual training or employ prompting with external translation tools, both of which are resource-intensive and sensitive to translation quality. To address these limitations, we propose a training-free inference-time method to enhance Multilingual Reasoning capabilities via Representation Engineering (MRRE) without using any additional training data or tools. MRRE sequentially injects two precomputed vectors at specific layers during inference: cross-lingual reasoning enhancement vectors, which steer non-English reasoning representations toward the English space to unlock multilingual reasoning, and target-language output anchoring vectors, which restore the distribution of the target language to preserve input-output language consistency. Comprehensive experiments across six advanced LLMs and LVLMs on four reasoning benchmarks demonstrate that MRRE consistently enhances non-English reasoning by an average gain of 5.48% and up to 7.54% in low-resource languages (Thai and Swahili), while improving input-output language consistency by 3.78%.

Paper Structure

This paper contains 38 sections, 7 equations, 12 figures, 13 tables.

Figures (12)

  • Figure 1: MRRE adopts a two-stage intervention strategy to unlock multilingual reasoning capabilities.
  • Figure 2: t-SNE hidden state visualization and reasoning performance of Qwen2.5-7B-Instruct and Qwen2.5-VL-7B-Instruct. Reasoning capability in English is substantially stronger than in other languages.
  • Figure 3: An overview of our proposed MRRE method. Each rectangle represents the model's hidden state during the forward pass. MRRE consists of three key stages: a) Cross-Lingual Reasoning Enhancement Vectors are derived from the hidden state differences between English and non-English reasoning responses. b) Target-Language Output Anchoring Vectors are derived from the hidden state differences between non-English and English language forcing prompts. c) Hierarchical Inference-Time Intervention: precomputed vectors are sequentially injected into the last-token representations at specific layers during the forward pass, thereby enhancing non-English reasoning capabilities while preserving input-output language consistency.
  • Figure 4: Kernel density estimate (KDE) plots of cross-lingual hidden states within Qwen2.5-7B-Instruct before and after the two types of intervention. The x-axis shows the SVM-derived signed distance to the mean English representation; the y-axis shows the estimated probability density.
  • Figure 5: Case study of Qwen2.5-VL-7B-Instruct on the MathVerse benchmark.
  • ...and 7 more figures
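
The Figure 3 caption describes both vector types as hidden state differences and the intervention as injection into last-token representations at specific layers. The sketch below illustrates that idea on toy data in plain Python. It is not the authors' implementation: it assumes the differences are taken between per-language mean hidden states and that injection is a scaled addition. The layer indices, the scaling factors `alpha` and `beta`, and all function names are hypothetical, since the exact formulas and layer choices are not given in this excerpt.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def difference_vector(target_states: Sequence[Sequence[float]],
                      source_states: Sequence[Sequence[float]]) -> Vector:
    """mean(target) - mean(source): a direction from source toward target.

    Using the mean is an assumption of this sketch.
    """
    t = mean_vector(target_states)
    s = mean_vector(source_states)
    return [a - b for a, b in zip(t, s)]


def inject(hidden: Sequence[float], vector: Sequence[float],
           scale: float) -> Vector:
    """Add a scaled steering vector to one last-token hidden state."""
    return [h + scale * v for h, v in zip(hidden, vector)]


def steer_last_token(layer_states: Sequence[Sequence[float]],
                     reasoning_vecs: Dict[int, Vector],
                     anchoring_vecs: Dict[int, Vector],
                     alpha: float, beta: float) -> List[Vector]:
    """Apply reasoning vectors, then anchoring vectors, at their layers.

    Each layer's state is treated independently here; nothing propagates.
    """
    steered = []
    for layer, hidden in enumerate(layer_states):
        hidden = list(hidden)
        if layer in reasoning_vecs:
            hidden = inject(hidden, reasoning_vecs[layer], alpha)
        if layer in anchoring_vecs:
            hidden = inject(hidden, anchoring_vecs[layer], beta)
        steered.append(hidden)
    return steered


# Toy 2-dimensional "hidden states" (made-up numbers for illustration only).
english_reasoning = [[1.0, 2.0], [3.0, 4.0]]
thai_reasoning = [[0.0, 1.0], [2.0, 1.0]]
thai_forced = [[0.0, 0.0], [0.0, 2.0]]      # prompts forcing Thai output
english_forced = [[1.0, 1.0], [1.0, 3.0]]   # prompts forcing English output

# Reasoning vector: English minus non-English reasoning states.
reasoning_vec = difference_vector(english_reasoning, thai_reasoning)
# Anchoring vector: non-English minus English language-forcing states.
anchoring_vec = difference_vector(thai_forced, english_forced)

# Hypothetical layer choice: reasoning at layer 1, anchoring at layer 2.
last_token_states = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
steered = steer_last_token(last_token_states,
                           {1: reasoning_vec}, {2: anchoring_vec},
                           alpha=0.5, beta=1.0)
print(steered)
```

In a real model the addition would happen inside the forward pass (for example through a layer hook), so a change made at one layer would flow into all later layers. The toy above leaves that propagation out to keep the arithmetic easy to follow.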