Reasoning Transfer for an Extremely Low-Resource and Endangered Language: Bridging Languages Through Sample-Efficient Language Understanding

Khanh-Tung Tran, Barry O'Sullivan, Hoang D. Nguyen

TL;DR

The paper tackles reasoning transfer to extremely low-resource languages by introducing English-Pivoted CoT Training, which constrains chain-of-thought traces to English while keeping inputs and outputs in the target language. By decomposing the objective into English-CoT generation and target-language final answer generation, the method demonstrates robust cross-lingual reasoning with limited data, supported by the new Irish LC2024 benchmark. Key findings include substantial gains over baselines (up to 28.33 percentage points on Irish AIME2024 and 73.33% on LC2024) and evidence that separating language understanding from reasoning improves cross-lingual transfer; ablations reveal the approach’s effectiveness across low-, medium-, and high-resource languages, with varying generalizability. The work offers a practical pathway for multilingual reasoning without extensive retraining per language and contributes a valuable dataset for evaluating mathematical reasoning in Irish.

Abstract

Recent advances have enabled Large Language Models (LLMs) to tackle reasoning tasks by generating chain-of-thought (CoT) rationales, yet these gains have largely applied to high-resource languages, leaving low-resource languages behind. In this work, we first investigate CoT techniques in extremely low-resource scenarios through previous prompting, model-editing, and fine-tuning approaches. We introduce English-Pivoted CoT Training, leveraging the insight that LLMs internally operate in a latent space aligned toward the dominant language. Given input in a low-resource language, we perform supervised fine-tuning to generate CoT in English and output the final response in the target language. Across mathematical reasoning benchmarks, our approach outperforms other baselines with up to 28.33% improvement in low-resource scenarios. Our analysis and additional experiments, including Mixed-Language CoT and Two-Stage Training, show that explicitly separating language understanding from reasoning enhances cross-lingual reasoning abilities. To facilitate future work, we also release LC2024, the first benchmark for mathematical tasks in Irish, an extremely low-resource and endangered language. Our results and resources highlight a practical pathway to multilingual reasoning without extensive retraining in every extremely low-resource language, despite data scarcity.
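The data layout described in the abstract (target-language input, English CoT, target-language final answer) can be illustrated with a minimal sketch. The `PivotExample` class, the `build_sft_pair` helper, the `<think>` delimiters, and the toy Irish strings are all assumptions made for illustration; the paper's actual template and data pipeline are not given in this summary.

```python
from dataclasses import dataclass


@dataclass
class PivotExample:
    """One hypothetical training record for English-pivoted CoT fine-tuning."""
    question_target: str  # problem statement in the target language (e.g. Irish)
    cot_english: str      # reasoning trace, kept in English
    answer_target: str    # final answer, written in the target language


def build_sft_pair(ex: PivotExample,
                   think_open: str = "<think>",
                   think_close: str = "</think>") -> dict:
    """Format a record as a prompt/completion pair for supervised fine-tuning.

    The delimiters are an assumed convention: the English reasoning sits
    between them and the target-language answer follows the closing tag.
    """
    prompt = ex.question_target.strip()
    completion = (
        f"{think_open}\n{ex.cot_english.strip()}\n{think_close}\n"
        f"{ex.answer_target.strip()}"
    )
    return {"prompt": prompt, "completion": completion}


# Toy record, invented for illustration only.
example = PivotExample(
    question_target="Cad é 2 + 3?",
    cot_english="Add the two numbers: 2 + 3 = 5.",
    answer_target="Is é 5 an freagra.",
)
pair = build_sft_pair(example)
```

Keeping the reasoning span and the answer span as separate fields makes it straightforward to track a loss for each part, which is the kind of split the objective decomposition in the TL;DR suggests.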

Paper Structure

This paper contains 9 sections, 2 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Illustrative example of model behavior (r1-distill-Llama-8B) when prompted with the same problem in English (robust reasoning) versus Irish, a low-resource language (reduced understanding and conversational ability). Training with Irish chain-of-thought traces diverges from the baseline, while training with English chain-of-thought traces achieves the best of both worlds.
  • Figure 2: Loss curves over the training process for Native CoT Training (left) and our approach, English-Pivoted CoT Training (right), smoothed with an exponential moving average (smoothing weight 0.95). Our method shows lower initial reasoning loss and slower decline, indicating effective separation of English reasoning from target-language responses.
  • Figure 3: Parameter updates (mean absolute differences) of Native CoT Training (left) and English-Pivoted CoT Training (right) for r1-distill-Llama-8B. English-Pivoted CoT Training focuses more on language comprehension layers (red boxes).
  • Figure 4: Representation retrieval accuracy across languages. Left: between questions. Right: between questions and the generated CoT traces of the same questions.
  • Figure 5: Performance of English-Pivoted CoT Training on Irish AIME and LC2024 with a varying number of training samples on r1-distill-Llama-8B (default is $N$).
  • ...and 2 more figures
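
The exponential moving average mentioned in the Figure 2 caption can be sketched as follows. This is a generic smoothing routine, assuming the common convention where each smoothed point is `weight` times the previous smoothed value plus `1 - weight` times the new value; the authors' exact plotting code is not given here, and the name `ema_smooth` is hypothetical.

```python
def ema_smooth(values, weight=0.95):
    """Smooth a sequence with an exponential moving average.

    A higher weight gives a smoother curve that reacts more slowly to
    new values. The first point is passed through unchanged.
    """
    smoothed = []
    last = None
    for v in values:
        # Seed with the first value, then blend each new value in.
        last = v if last is None else weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


# Toy loss values, invented for illustration only.
raw_losses = [2.0, 1.0, 1.5, 0.5]
smoothed_losses = ema_smooth(raw_losses, weight=0.95)
```

With a weight of 0.95, each new loss value contributes only 5% to the plotted point, so short-lived spikes are heavily damped and trends appear with some lag.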