Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges
Yangfan Ye, Xiaocheng Feng, Xiachong Feng, Libo Qin, Yichong Huang, Lei Huang, Weitao Ma, Qichen Hong, Zhirui Zhang, Yunfei Lu, Xiaohui Yan, Duyu Tang, Dandan Tu, Bing Qin
TL;DR
This work tackles the imbalance in multilingual capabilities and cultural adaptability in English-centric LLMs by introducing XTransplant, a framework that performs cross-lingual latent transplantation during inference. By transplanting latent activations across languages at decoder layers, the method aims to combine English strengths with non-English knowledge, revealing distinct roles for attention and feed-forward modules in multilingual understanding and culture-specific knowledge capture. Extensive analyses across multiple models, languages, cultures, and granularities show that attention-level (Attn) transplantation most benefits multilingual tasks while feed-forward-level (FFN) transplantation better supports culture-related understanding, and that the upper-bound performance of transplantation lies substantially above vanilla performance. The findings emphasize the existence of underutilized multilingual potential in current LLMs and highlight the need for dynamic, instance-aware layer selection strategies to approach the identified upper bound, offering a new direction for cross-lingual interaction research.
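The gap between a fixed layer choice and the upper bound can be illustrated with a small sketch. It assumes one plausible reading of "upper bound": an instance counts as solved if at least one candidate layer choice yields a correct answer. The function names, the `(source_layer, target_layer)` keys, and the toy correctness table are hypothetical and are not results from this work.

```python
from typing import Dict, List, Tuple

# Hypothetical key: (source_layer, target_layer) used for one transplantation.
LayerChoice = Tuple[int, int]


def best_fixed_choice_accuracy(results: List[Dict[LayerChoice, bool]]) -> float:
    """Accuracy of the single layer choice that works best across all instances."""
    choices = results[0].keys()
    return max(sum(r[c] for r in results) / len(results) for c in choices)


def upper_bound_accuracy(results: List[Dict[LayerChoice, bool]]) -> float:
    """Oracle accuracy (assumed definition): an instance is solved if any
    layer choice answers it correctly."""
    return sum(any(r.values()) for r in results) / len(results)


# Made-up toy table: one dict per instance, True if that choice was correct.
toy_results = [
    {(0, 1): True, (1, 2): False},
    {(0, 1): False, (1, 2): True},
    {(0, 1): False, (1, 2): False},
    {(0, 1): True, (1, 2): True},
]

fixed = best_fixed_choice_accuracy(toy_results)
oracle = upper_bound_accuracy(toy_results)
```

In this toy table no single choice solves more than half of the instances, while the oracle solves three of four. That difference is the kind of headroom an instance-aware layer selection strategy would try to recover.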
Abstract
Current large language models (LLMs) often exhibit imbalances in multilingual capabilities and cultural adaptability, largely attributed to their English-centric pre-training data. In this paper, we introduce and investigate a cross-lingual latent transplantation (XTransplant) framework, which aims to further exploit the model's internalized multilingual knowledge during inference and examine its effects on the multilingual capability and cultural adaptability of LLMs. The XTransplant framework enables models to harness the complementary strengths of both English and non-English resources by transplanting latent activations across languages. Through extensive analysis, we empirically demonstrate that XTransplant, a form of cross-lingual interaction, has mutually beneficial effects on the multilingual capability and cultural adaptability of LLMs, particularly for low-resource languages and cultures. We further reveal that attention modules play a pivotal role in supporting multilingual understanding, while feed-forward modules are more adept at capturing culture-specific knowledge. In addition, we conduct an in-depth analysis of XTransplant's stability, effectiveness, and generalizability. By probing the upper-bound performance of XTransplant, we expose the considerable underutilization of current LLMs' multilingual potential, a challenge that remains open. We hope our analysis offers a new lens for advancing cross-lingual interactions and better leveraging models' internalized multilingual knowledge.
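The idea of transplanting latent activations can be sketched on a toy decoder. This is a minimal illustration and not the authors' implementation: it assumes that transplantation means capturing the output of one sub-module (attention or feed-forward) while processing a source-language input, then overwriting the output of the same kind of sub-module at a chosen layer while processing the target-language input. The toy layers, `forward`, and `xtransplant` are hypothetical names. In a real framework the same capture-and-overwrite pattern would typically use forward hooks.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]
Module = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Tuple[Module, Module]:
    """Build toy (attn, ffn) sub-modules for one decoder layer."""
    def attn(h: Vector) -> Vector:
        mean = sum(h) / len(h)              # mixes information across positions
        return [scale * mean for _ in h]

    def ffn(h: Vector) -> Vector:
        return [max(0.0, x + shift) for x in h]  # position-wise ReLU
    return attn, ffn


# Hypothetical three-layer toy decoder with arbitrary parameters.
LAYERS = [make_layer(0.5, 0.1), make_layer(1.0, -0.2), make_layer(1.5, 0.3)]


def forward(
    h: Vector,
    capture: Optional[Tuple[int, str]] = None,
    patch: Optional[Tuple[int, str, Vector]] = None,
) -> Tuple[Vector, Optional[Vector]]:
    """Run the toy decoder.

    capture: (layer, module_name) whose output is recorded.
    patch:   (layer, module_name, activation) whose output is overwritten.
    Returns the final hidden state and the captured activation (or None).
    """
    captured = None
    for idx, (attn, ffn) in enumerate(LAYERS):
        for name, module in (("attn", attn), ("ffn", ffn)):
            out = module(h)
            if patch is not None and patch[:2] == (idx, name):
                out = list(patch[2])        # transplant the foreign activation
            if capture == (idx, name):
                captured = list(out)
            h = [a + b for a, b in zip(h, out)]  # residual connection
    return h, captured


def xtransplant(source_h: Vector, target_h: Vector,
                src_layer: int, tgt_layer: int, module: str) -> Vector:
    """Capture `module` output at src_layer on the source input, then
    overwrite `module` output at tgt_layer on the target input.
    Assumes both inputs have the same length."""
    _, activation = forward(source_h, capture=(src_layer, module))
    output, _ = forward(target_h, patch=(tgt_layer, module, activation))
    return output


english_h = [1.0, 2.0]       # stand-in for an English prompt's hidden state
non_english_h = [3.0, -1.0]  # stand-in for a non-English prompt's hidden state

vanilla, _ = forward(non_english_h)
patched = xtransplant(english_h, non_english_h, 0, 0, "attn")
```

Passing `"ffn"` instead of `"attn"` switches the granularity of the transplant, and varying `src_layer` and `tgt_layer` corresponds to the layer selection question raised above. Transplanting an input's own activation back into the same layer leaves the output unchanged, which is a useful sanity check for any real implementation.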
