Concept Space Alignment in Multilingual LLMs

Qiwei Peng; Anders Søgaard

TL;DR

The experiments show that multilingual LLMs suffer from two familiar weaknesses: generalization works best for languages with similar typology, and for abstract concepts.

Abstract

Multilingual large language models (LLMs) seem to generalize somewhat across languages. We hypothesize this is a result of implicit vector space alignment. Evaluating such alignment, we see that larger models exhibit very high-quality linear alignments between corresponding concepts in different languages. Our experiments show that multilingual LLMs suffer from two familiar weaknesses: generalization works best for languages with similar typology, and for abstract concepts. For some models, e.g., the Llama-2 family of models, prompt-based embeddings align better than word embeddings, but the projections are less linear -- an observation that holds across almost all model families, indicating that some of the implicitly learned alignments are broken somewhat by prompt-based methods.
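
To make the evaluation idea concrete, here is a minimal sketch of fitting a linear map between two embedding spaces on a seed dictionary and scoring held-out concepts with Precision@1. It assumes an orthogonal Procrustes solution as the linear map and cosine nearest-neighbor retrieval; the abstract does not specify either choice, and the data below is synthetic stand-in data, not model embeddings. The function names `fit_orthogonal_map` and `precision_at_1` are hypothetical.

```python
import numpy as np


def fit_orthogonal_map(src, tgt):
    """Orthogonal Procrustes: the orthogonal W minimizing ||src @ W - tgt||_F.

    Assumption: an orthogonal map stands in for "linear alignment" here.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


def precision_at_1(src, tgt, w):
    """Fraction of mapped source rows whose cosine-nearest target row
    is the row with the same index (i.e. the parallel concept)."""
    mapped = src @ w
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    nearest = (mapped @ tgt_n.T).argmax(axis=1)
    return float((nearest == np.arange(len(src))).mean())


# Synthetic stand-in for concept embeddings in two languages:
# the "target" space is a rotated, slightly noisy copy of the "source" space.
rng = np.random.default_rng(0)
dim, n_seed, n_test = 16, 200, 50
true_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src = rng.normal(size=(n_seed + n_test, dim))
tgt = src @ true_rot + 0.01 * rng.normal(size=src.shape)

# Fit on the seed dictionary, evaluate on held-out concept pairs.
w = fit_orthogonal_map(src[:n_seed], tgt[:n_seed])
p_at_1 = precision_at_1(src[n_seed:], tgt[n_seed:], w)
print(f"P@1 on held-out pairs: {p_at_1:.2f}")
```

In this toy setup the two spaces are related by a near-exact rotation, so retrieval should be close to perfect; with real embeddings, a lower P@1 would indicate that the spaces are less linearly alignable.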

Paper Structure (16 sections, 1 equation, 3 figures, 24 tables)

Introduction
Contributions
Experiments
Concepts
LLMs
Alignment and Retrieval
Main Results
Abstract vs. Physical
Discussion and Related Work
Related Work
Linear Alignment
Difference in Languages
Types of Concepts
Conclusion
Language Resource and Shared Vocabulary
...and 1 more section

Figures (3)

Figure 1: Examples of four parallel WordNet concepts, aligned across 7 languages.
Figure 2: Performance (P@1) of different LLMs on the concept alignment evaluation when using a seed dictionary of 3,000 concepts. X-axis: languages, divided into three groups: Group 1 is Indo-European languages, Group 2 is non-Indo-European languages written in Latin script, and Group 3 is non-Indo-European languages not written in Latin script. Y-axis: Precision@1.
Figure 3: Performance (P@1) of different LLMs on the concept alignment evaluation when using a seed dictionary of 3,000 pairs. X-axis: languages, divided into three groups: Group 1 is Indo-European languages, Group 2 is non-Indo-European languages written in Latin script, and Group 3 is non-Indo-European languages not written in Latin script. Y-axis: Precision@1.
