MultiSlav: Using Cross-Lingual Knowledge Transfer to Combat the Curse of Multilinguality

Artur Kot, Mikołaj Koszowski, Wojciech Chojnowski, Mieszko Rutkowski, Artur Nowakowski, Kamil Guttmann, Mikołaj Pokrywka

TL;DR

The paper investigates whether expanding multilingual NMT data yields beneficial cross-lingual knowledge transfer or the Curse of Multilinguality for Slavic languages. By evaluating Bi-Directional, Pivot, and Multilingual models, including English as a bridge, the study demonstrates cross-lingual gains within the Slavic family and notable improvements from multilingual training, especially with the MultiSlav+ENG configuration. While commercial LLMs deliver top scores, open-source approaches remain competitive, with MultiSlav+ENG providing the most robust gains among freely available models. The work releases open-source Slavic NMT models on HuggingFace, highlighting practical implications for translating among Czech, Polish, Slovak, Slovene, and English, and offering a foundation for broader cross-lingual transfer research in morphologically rich languages.

Abstract

Does multilingual Neural Machine Translation (NMT) lead to the Curse of Multilinguality, or does it provide Cross-lingual Knowledge Transfer within a language family? In this study, we explore multiple approaches for extending the available data-regime in NMT, and we show cross-lingual benefits even in the zero-shot translation regime for low-resource languages. With this paper, we provide state-of-the-art open-source NMT models for translating between selected Slavic languages. We released our models on the HuggingFace Hub (https://hf.co/collections/allegro/multislav-6793d6b6419e5963e759a683) under the CC BY 4.0 license. The Slavic language family comprises morphologically rich Central and Eastern European languages. Although Slavic languages have hundreds of millions of native speakers, Slavic Neural Machine Translation is, in our opinion, under-studied. Most recent NMT research focuses on one of three areas: high-resource languages such as English, Spanish, and German (in the WMT23 General Translation Task, 7 out of 8 task directions are from or to English); massively multilingual models covering multiple language groups; or evaluation techniques.

Paper Structure

This paper contains 31 sections, 4 figures, 13 tables.

Figures (4)

  • Figure 1: Strategies for increasing the data-regime without decreasing the quality of the model, illustrated by the example of translating from Polish to Czech. In parentheses we show how many data points were added compared to the baseline.
  • Figure 2: The Bi-directional Model translates in both directions between 2 languages.
  • Figure 3: The Pivot system uses 2 models: the first translates from multiple languages to the Bridge Language, and the second translates from the Bridge Language to multiple languages, effectively translating between all supported languages.
  • Figure 4: The Multilingual Model directly translates between all supported languages.
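
The three system types in Figures 2 to 4 differ mainly in how a translation request is routed and in how many models are needed to cover every direction. The following is a minimal sketch of that routing, not the authors' implementation. The `Translator` type, `make_pivot_system`, `models_needed`, and the toy translator are hypothetical names introduced for illustration, and the language codes are illustrative. The model counts assume one Bi-directional model per unordered language pair, which is an inference from the Figure 2 caption rather than a figure reported in the paper.

```python
from math import comb
from typing import Callable, Dict

# Hypothetical interface: a translator maps (text, source, target) to text.
Translator = Callable[[str, str, str], str]


def make_pivot_system(
    many_to_bridge: Translator,
    bridge_to_many: Translator,
    bridge: str,
) -> Translator:
    """Compose two models into a pivot system (as described for Figure 3)."""

    def translate(text: str, src: str, tgt: str) -> str:
        if src == tgt:
            return text
        # Step 1: source -> bridge (skipped when the source is the bridge).
        intermediate = text if src == bridge else many_to_bridge(text, src, bridge)
        # Step 2: bridge -> target (skipped when the target is the bridge).
        if tgt == bridge:
            return intermediate
        return bridge_to_many(intermediate, bridge, tgt)

    return translate


def models_needed(n_languages: int) -> Dict[str, int]:
    """Models required to cover all directions among n languages.

    Assumption: one bi-directional model per unordered language pair.
    """
    return {
        "bi-directional": comb(n_languages, 2),
        "pivot": 2,
        "multilingual": 1,
    }


def toy_translator(text: str, src: str, tgt: str) -> str:
    """Stand-in for a real NMT model: it only tags the text with the hop."""
    return f"[{src}->{tgt}] {text}"


# English as the bridge language; both pivot models are toy stand-ins.
pivot = make_pivot_system(toy_translator, toy_translator, bridge="eng")

polish_to_czech = pivot("Dzień dobry", "pol", "ces")
polish_to_english = pivot("Dzień dobry", "pol", "eng")
counts_for_five = models_needed(5)
```

With the toy translator, `polish_to_czech` records both hops through the bridge, while `polish_to_english` needs only one. A real pivot system pays for its small model count with two decoding passes per non-bridge request, and any error in the first hop is carried into the second.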