Extending CREAMT: Leveraging Large Language Models for Literary Translation Post-Editing

Antonio Castaldo, Sheila Castilho, Joss Moorkens, Johanna Monti

TL;DR

The paper addresses the challenge of translating literature efficiently while preserving creativity by evaluating post-editing of LLM-generated translations. It compares GPT-4, GPT-3.5, and a domain-adapted Mistral-7B, using a custom UniOr-PET workflow with professional translators translating an Atwood novel excerpt into Italian. Key findings show significant reductions in editing time with LLM post-edits, while creativity remains comparable to human translation, particularly with the domain-adapted model. The work demonstrates the practical potential of LLMs to assist literary translators in high-resource languages and outlines a path for integrating domain-adapted models into creative translation workflows.

Abstract

Post-editing machine translation (MT) for creative texts, such as literature, requires balancing efficiency with the preservation of creativity and style. While neural MT systems struggle with these challenges, large language models (LLMs) offer improved capabilities for context-aware and creative translation. This study evaluates the feasibility of post-editing literary translations generated by LLMs. Using a custom research tool, we collaborated with professional literary translators to analyze editing time, quality, and creativity. Our results indicate that post-editing LLM-generated translations significantly reduces editing time compared to human translation while maintaining a similar level of creativity. The minimal difference in creativity between post-editing (PE) and MT, combined with substantial productivity gains, suggests that LLMs may effectively support literary translators working with high-resource languages.

Paper Structure

This paper contains 18 sections, 2 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Each translator translates from scratch one chunk of original text (Translation) and post-edits a different chunk of each model's output (Model X, Y, Z), minimizing the translator's effect.
  • Figure 2: UniOr PET user interface
  • Figure 3: Quality metrics scores (BLEU, chrF, COMET) for different MT systems.
  • Figure 4: The original creativity score formula that we used as the starting point for our own score.