A Morphology-Based Investigation of Positional Encodings

Poulami Ghosh, Shikhar Vashishth, Raj Dabre, Pushpak Bhattacharyya

TL;DR

The paper asks whether the importance of positional encodings (PE) in transformer-based models depends on a language's morphological complexity. It ablates positional encodings during fine-tuning across 22 languages and 5 tasks, using a type-token ratio proxy and Flores-200 alignment to quantify morphology. The results show a negative relationship between morphological richness and reliance on PE: analytic languages suffer larger performance drops than morphologically rich languages when PE is removed, which suggests a need for morphology-aware PE designs. The work advances cross-language understanding of what PE contributes and motivates encodings that better reflect linguistic diversity in multilingual settings.
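The type-token ratio mentioned above can be illustrated with a minimal sketch. It assumes whitespace tokenization and lowercasing, which may differ from the paper's preprocessing; the function name `type_token_ratio` and the toy sentences are hypothetical and are not data from the paper. The intuition is that on text with comparable content, a morphologically richer language produces more distinct word forms and therefore a higher ratio.

```python
def type_token_ratio(tokens):
    """Return the number of distinct tokens (types) divided by the total token count."""
    if not tokens:
        raise ValueError("cannot compute a type-token ratio for an empty token list")
    return len(set(tokens)) / len(tokens)


def ttr_from_text(text):
    """Assumption: lowercase and split on whitespace; real pipelines may tokenize differently."""
    return type_token_ratio(text.lower().split())


# Toy illustration only: repeated word forms lower the ratio.
repetitive = "the dog sees the dog"        # 3 types over 5 tokens
varied = "dogs see another dog running"    # 5 types over 5 tokens

print(ttr_from_text(repetitive))
print(ttr_from_text(varied))
```

Because the ratio falls as a text gets longer, comparisons across languages are only meaningful on text of comparable size and content, which is presumably why a parallel corpus such as Flores-200 is used.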

Abstract

Contemporary deep learning models effectively handle languages with diverse morphology even though morphological information is not explicitly built into them. Morphology and word order are closely linked, and word order is incorporated into transformer-based models through positional encodings. This prompts a fundamental question: is there a correlation between the morphological complexity of a language and the use of positional encoding in pre-trained language models? To answer it, we present the first study addressing this question, covering 22 languages and 5 downstream tasks. Our findings reveal that the importance of positional encoding diminishes as the morphological complexity of a language increases. Our study motivates a deeper understanding of positional encodings and their augmentation to better reflect the languages under consideration.

Paper Structure (24 sections, 7 figures, 6 tables)

Figures (7)

  • Figure 1: The figure illustrates the effect of word order on semantics for two languages: English (left) and Sanskrit (right). English is a morphologically poor language with subject-verb-object (SVO) word order, whereas Sanskrit is a morphologically rich language with no dominant word order (NODOM). Distorting the word order completely alters the meaning in English, whereas in Sanskrit the meaning remains intact.
  • Figure 2: Effect of Positional Encoding on NER task.
  • Figure 3: Effect of Positional Encoding on POS task.
  • Figure 4: Effect of Positional Encoding on Dependency Parsing.
  • Figure 5: Effect of Positional Encoding on XNLI.
  • ...and 2 more figures