Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Shimao Zhang; Changjiang Gao; Wenhao Zhu; Jiajun Chen; Xin Huang; Xue Han; Junlan Feng; Chao Deng; Shujian Huang

TL;DR

This work investigates spontaneous multilingual alignment in large language models by training solely on parallel question translation data, without annotated answers. Using LoRA-based instruction-tuning and in-context learning, the authors demonstrate significant multilingual transfer improvements across 20 languages and multiple tasks, even for languages unseen during training, which challenges the assumption that annotated answers are necessary. Mechanistic interpretability methods (logit lens and PCA) expose latent representations that bridge languages and show improved alignment after training. The results suggest that multilingual alignment can be achieved efficiently with limited parallel data, highlighting practical potential for broad language generalization in LLMs.

Abstract

Recently, Large Language Models (LLMs) have shown impressive language capabilities. However, most existing LLMs perform very unevenly across languages, and multilingual alignment based on parallel translation data is an effective way to enhance their multilingual capabilities. In this work, we discover and comprehensively investigate the spontaneous multilingual alignment improvement of LLMs. We find that instruction-tuning LLMs on question translation data (i.e., without annotated answers) encourages alignment between English and a wide range of languages, including languages unseen during instruction-tuning. Additionally, we use different settings and mechanistic interpretability methods to comprehensively analyze LLM performance in the multilingual scenario. Our work suggests that LLMs have enormous potential for improving multilingual alignment efficiently, with strong generalization across languages and tasks.
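To make "question translation data (i.e., without annotated answers)" concrete, the following is a minimal sketch of how parallel question pairs could be turned into instruction-tuning examples. The prompt template, the `TranslationExample` type, the `build_examples` helper, and the choice of English as the target side are all illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class TranslationExample:
    prompt: str  # instruction plus the source-language question
    target: str  # the parallel question only; no task answer is included


# Hypothetical prompt template (assumption; the paper's wording is not given here).
TEMPLATE = (
    "Translate the following question from {src} to {tgt}.\n"
    "{src}: {question}\n"
    "{tgt}:"
)


def build_examples(
    parallel_questions: Iterable[Tuple[str, str]],
    src_lang: str,
    tgt_lang: str = "English",  # assumed translation direction
) -> List[TranslationExample]:
    """Turn (source question, target question) pairs into tuning examples."""
    examples = []
    for src_q, tgt_q in parallel_questions:
        prompt = TEMPLATE.format(src=src_lang, tgt=tgt_lang, question=src_q.strip())
        # The supervision signal is the translated question itself,
        # so no annotated answer to the question is ever needed.
        examples.append(TranslationExample(prompt=prompt, target=" " + tgt_q.strip()))
    return examples


pairs = [("¿Cuánto es 2 más 3?", "What is 2 plus 3?")]
examples = build_examples(pairs, src_lang="Spanish")
```

Each resulting example pairs a translation prompt with the parallel question as its target, so a model tuned on such data sees only questions in two languages and never a solution to them.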
