How does a Multilingual LM Handle Multiple Languages?

Santhosh Kakarla, Gautama Shastry Bulusu Venkata, Aishwarya Gaddam

TL;DR

The paper investigates how multilingual language models represent and transfer knowledge across languages along three strands: semantic analysis of multilingual word embeddings, probing of the internal representations of BLOOM-1.7B and Qwen2, and cross-lingual transfer with BLOOM-560m and BERT-base Multilingual Cased on XNLI. Related languages align well in embedding space, while a distant language such as Chinese diverges. Probing shows that deeper layers specialize on tasks and that Qwen2 maintains cross-lingual alignment more robustly than BLOOM in many cases. In cross-lingual transfer, BLOOM-560m generally outperforms multilingual BERT, yet performance gaps persist for Swahili and other low-resource languages, which points to the need for richer data and architectural enhancements. The findings inform strategies for building more inclusive multilingual NLP systems, including data augmentation, stronger fine-tuning, and targeted architectural improvements for low-resource languages.

Abstract

Multilingual language models (MLMs) have advanced significantly alongside rapid progress in natural language processing. Models like BLOOM-1.7B, trained on diverse multilingual datasets, aim to bridge linguistic gaps. However, their effectiveness in capturing linguistic knowledge, particularly for low-resource languages, remains an open question. This study critically examines MLMs' capabilities in multilingual understanding, semantic representation, and cross-lingual knowledge transfer. While these models perform well for high-resource languages, they struggle with less-represented ones. Additionally, traditional evaluation methods often overlook how the models encode syntax and semantics internally. This research addresses these limitations through three objectives. First, it assesses semantic similarity by checking multilingual word embeddings for consistency using cosine similarity. Second, it probes BLOOM-1.7B and Qwen2 through Named Entity Recognition and sentence similarity tasks to understand the linguistic structures they encode. Third, it explores cross-lingual knowledge transfer by evaluating generalization from high-resource to low-resource languages in sentiment analysis and text classification. By combining linguistic probing, performance metrics, and visualizations, this study provides insights into the strengths and limitations of MLMs. The findings aim to improve multilingual NLP models so that they better support both high- and low-resource languages, thereby promoting inclusivity in language technologies.
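The first objective relies on cosine similarity between a word's embedding and the embedding of its translation. The following is a minimal sketch of that comparison, not the authors' implementation: the function names, the dictionary keys, and the 4-dimensional vectors are all hypothetical toy values chosen for illustration, and they carry no evidence about any real model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_to_anchor(embeddings, anchor_key):
    """Compare every other entry against the anchor (here, the English word)."""
    anchor = embeddings[anchor_key]
    return {
        key: cosine_similarity(anchor, vec)
        for key, vec in embeddings.items()
        if key != anchor_key
    }


# Hypothetical toy embeddings; real ones would come from a model's embedding layer.
toy_embeddings = {
    "en:water": [0.9, 0.1, 0.3, 0.0],
    "fr:eau": [0.8, 0.2, 0.4, 0.1],
    "zh:水": [0.1, 0.9, 0.0, 0.5],
}

scores = similarity_to_anchor(toy_embeddings, "en:water")
for key, score in sorted(scores.items()):
    print(f"{key}: {score:.3f}")
```

A score near 1 means the two vectors point in almost the same direction, and a score near 0 means they are close to orthogonal. Because cosine similarity ignores vector length, it compares direction only, which is why it is a common choice for comparing embeddings whose norms differ.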

Paper Structure

This paper contains 13 sections and 8 figures.

Figures (8)

  • Figure 1: Cosine similarity between English words and their translations in four languages (French, German, Spanish, and Chinese). Chinese shows notably lower cosine similarity than the other three.
  • Figure 2: t-SNE visualization of word embeddings across five languages: English, French, German, Spanish, and Chinese. Related languages form overlapping clusters, while Chinese forms a distinct, isolated cluster.
  • Figure 3: Sentence similarity between English and other languages in the BLOOM model. Performance decreases as layer depth increases.
  • Figure 4: Sentence similarity between English and other languages in the Qwen model. Arabic performs better than in the BLOOM model.
  • Figure 5: Hidden state analysis across layers in the BLOOM model. Hidden state values decrease sharply in the deeper layers, indicating degradation of useful representations.
  • ...and 3 more figures
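
The layer-wise sentence similarity curves described for Figures 3 and 4 can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the captions do not say how sentence vectors are derived, so mean pooling over token vectors is assumed here, and every name and number below is hypothetical. With Hugging Face Transformers, per-layer hidden states of this shape are typically obtained by passing `output_hidden_states=True` to the model call; the toy lists stand in for them.

```python
import math


def mean_pool(token_vectors):
    """Average a sentence's token vectors into a single sentence vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def layerwise_similarity(layers_a, layers_b):
    """Return one similarity score per layer.

    layers_x[l] is the list of token vectors that layer l produced
    for sentence x. The two sentences may have different token counts.
    """
    if len(layers_a) != len(layers_b):
        raise ValueError("both sentences need the same number of layers")
    return [
        cosine(mean_pool(a), mean_pool(b))
        for a, b in zip(layers_a, layers_b)
    ]


# Hypothetical hidden states: 2 layers, 2 dimensions, toy values only.
english_layers = [
    [[1.0, 0.0], [1.0, 0.2]],  # layer 0, two tokens
    [[0.5, 0.5], [0.7, 0.3]],  # layer 1, two tokens
]
french_layers = [
    [[0.9, 0.1], [1.0, 0.1], [0.8, 0.1]],  # layer 0, three tokens
    [[0.1, 0.9], [0.3, 0.7], [0.2, 0.8]],  # layer 1, three tokens
]

curve = layerwise_similarity(english_layers, french_layers)
for layer, score in enumerate(curve):
    print(f"layer {layer}: {score:.3f}")
```

Plotting such a curve against layer index for each language pair gives the kind of view the captions describe. Other pooling choices, such as taking the last token's vector, would change the numbers and possibly the shape of the curve.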