Automatic Translation Alignment Pipeline for Multilingual Digital Editions of Literary Works

Maria Levchenko

TL;DR

This research highlights the limitations of current state-of-the-art algorithms when applied to the translation of literary texts and outlines an automated pipeline for creating a Multilingual Digital Edition (MDE) that transforms raw texts into web-based, side-by-side representations of original and translated texts with different rendering options.

Abstract

This paper investigates the application of translation alignment algorithms in the creation of a Multilingual Digital Edition (MDE) of Alessandro Manzoni's Italian novel "I promessi sposi" ("The Betrothed"), with translations in eight languages (English, Spanish, French, German, Dutch, Polish, Russian and Chinese) from the 19th and 20th centuries. We identify key requirements for the MDE to improve both the reader experience and support for translation studies. Our research highlights the limitations of current state-of-the-art algorithms when applied to the translation of literary texts and outlines an automated pipeline for MDE creation. This pipeline transforms raw texts into web-based, side-by-side representations of original and translated texts with different rendering options. In addition, we propose new metrics for evaluating the alignment of literary translations and suggest visualization techniques for future analysis.

Paper Structure

This paper contains 13 sections and 5 figures.

Figures (5)

  • Figure 1: Alignment Types in The Betrothed
  • Figure 2: The visualization of the alignment of the long sentence.
  • Figure 3: Similarity visualisation for sentence-level alignment in the German translation of chapter 23. This translation omits Don Abbondio's inner monologue, which is not captured in the sentence-level alignment, but is evident in the visualisation, where several Italian sentences appear without corresponding German pairs.
  • Figure 4: Reducing the Lengths of Aligned Pairs for the Spanish 1858
  • Figure 5: The Translation Alignment Pipeline