Investigating Language Preference of Multilingual RAG Systems
Jeonghyun Park, Hwanhee Lee
TL;DR
This work systematically analyzes language preference in multilingual RAG (mRAG) systems, uncovering retriever bias toward high-resource and query languages and generator bias toward the query language or Latin-script languages. It introduces MultiLingualRankShift (MLRS) to quantify retriever language bias and demonstrates how language choices affect retrieval quality and end-task performance. To mitigate these biases, the authors propose Dual Knowledge Multilingual RAG (DKM-RAG), which fuses externally translated passages with internally rewritten, model-enhanced content, yielding improved performance across diverse linguistic settings. The study reveals that language-resource distribution and translation quality significantly shape mRAG behavior, and it proposes language-aware strategies (e.g., translating passages into the query language $L_q$ for non-English queries) to optimize results. Collectively, the findings highlight practical considerations for deploying robust multilingual RAG systems and point to DKM-RAG as a simple yet effective bias-mitigation framework.
Abstract
Multilingual Retrieval-Augmented Generation (mRAG) systems enhance language models by integrating external multilingual information to produce context-aware responses. However, mRAG systems struggle to retrieve relevant information because of linguistic variation between queries and documents, and they generate inconsistent responses when multilingual sources conflict. In this work, we systematically investigate language preferences in both the retrieval and generation stages of mRAG through a series of experiments. Our analysis indicates that retrievers tend to prefer high-resource and query languages, yet this preference does not consistently improve generation performance. Moreover, we observe that generators prefer the query language or Latin scripts, leading to inconsistent outputs. To overcome these issues, we propose Dual Knowledge Multilingual RAG (DKM-RAG), a simple yet effective framework that fuses translated multilingual passages with complementary model knowledge. Empirical results demonstrate that DKM-RAG mitigates language preference in generation and enhances performance across diverse linguistic settings. Code is available at https://github.com/jeonghyunpark2002/LanguagePreference.git
