Language Specific Knowledge: Do Models Know Better in X than in English?

Ishika Agarwal; Nimet Beyza Bozdag; Dilek Hakkani-Tür

TL;DR

Challenging the latent language alignment hypothesis, the paper defines Language Specific Knowledge (LSK) as knowledge that a given LLM accesses best in an "expert language", and proposes LSKExtractor to map topics to expert languages and exploit this mapping during inference. The framework operates in two stages: (i) group training queries into semantic clusters and assign each cluster an expert language based on chain-of-thought performance across languages, and (ii) at test time, identify the closest cluster for a new query and reason in that cluster’s expert language. Evaluations on CultureAtlas, BLEnD, and SocialIQa with multiple instruction-tuned models show up to 10% relative improvement and performance competitive with strong baselines, and indicate that learned LSK maps transfer across models and datasets. The work advocates inclusive, culturally aware multilingual reasoning and provides open-source tooling to support practical adoption and further research.
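The two stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`nearest_cluster`, `build_lsk_map`, `select_language`), the toy two-dimensional embeddings, the fixed centroids, and the per-language correctness flags are all hypothetical stand-ins for a real embedding model, a real clustering step, and real chain-of-thought evaluations.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_cluster(embedding, centroids):
    """Index of the centroid most similar to the embedding."""
    return max(range(len(centroids)), key=lambda i: cosine(embedding, centroids[i]))


def build_lsk_map(train_records, centroids):
    """Stage (i): assign each cluster the language with the highest accuracy.

    train_records is a list of (embedding, {language: answered_correctly}).
    Ties are broken alphabetically by language code (an arbitrary choice).
    """
    tallies = defaultdict(lambda: defaultdict(list))
    for embedding, per_language_correct in train_records:
        cluster = nearest_cluster(embedding, centroids)
        for language, correct in per_language_correct.items():
            tallies[cluster][language].append(1.0 if correct else 0.0)
    lsk_map = {}
    for cluster, by_language in tallies.items():
        lsk_map[cluster] = max(
            sorted(by_language),
            key=lambda lang: sum(by_language[lang]) / len(by_language[lang]),
        )
    return lsk_map


def select_language(query_embedding, centroids, lsk_map, default="en"):
    """Stage (ii): look up the expert language of the closest cluster."""
    return lsk_map.get(nearest_cluster(query_embedding, centroids), default)


# Toy data (hypothetical): two clusters, two candidate languages.
centroids = [[1.0, 0.0], [0.0, 1.0]]
train_records = [
    ([0.9, 0.1], {"en": True, "hi": False}),
    ([0.8, 0.2], {"en": True, "hi": True}),
    ([0.1, 0.9], {"en": False, "hi": True}),
    ([0.2, 0.8], {"en": True, "hi": True}),
]
lsk_map = build_lsk_map(train_records, centroids)
expert_language = select_language([0.05, 0.95], centroids, lsk_map)
```

In this toy setup the first cluster is answered best in English and the second in Hindi, so a new query that lands near the second centroid would be routed to Hindi for reasoning. Falling back to a default language for clusters with no training data is a design choice of this sketch, not something the summary above specifies.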

Abstract

Multilingual language models are often trained with the objective of mapping semantically similar content in different languages into the same latent space. In this paper, we show a nuance in this training objective: by changing the language of the input query, we can improve the question-answering ability of language models. Our contributions are two-fold. First, we introduce the term Language Specific Knowledge (LSK) to denote queries that are best answered in an "expert language" for a given LLM, thereby enhancing its question-answering ability. We introduce the problem of language selection: for some queries, language models perform better when queried in languages other than English, sometimes even in low-resource languages, and the goal is to select the optimal language for each query. Second, we introduce baselines ranging from simple to strong to test this problem. Additionally, as a first-pass solution to this novel problem, we design LSKExtractor to benchmark the language-specific knowledge present in a language model and then exploit it during inference. To test our framework, we employ three datasets that contain knowledge about both cultural and social behavioral norms. Overall, LSKExtractor achieves up to 10% relative improvement across datasets and is competitive against strong baselines, while remaining feasible in real-world settings. Broadly, our research contributes to the open-source development (https://github.com/agarwalishika/LSKExtractor/tree/main) of language models that are inclusive and more aligned with the cultural and linguistic contexts in which they are deployed.
