Table of Contents
Fetching ...

Translating Step-by-Step: Decomposing the Translation Process for Improved Translation Quality of Long-Form Texts

Eleftheria Briakou, Jiaming Luo, Colin Cherry, Markus Freitag

TL;DR

Instead of viewing machine translation as a single, monolithic task, this paper proposes a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations.

Abstract

In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. Instead of viewing machine translation as a single, monolithic task, we propose a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations. Extensive automatic evaluations using Gemini 1.5 Pro across ten language pairs show that translating step-by-step yields large translation quality improvements over conventional zero-shot prompting approaches and earlier human-like baseline strategies, resulting in state-of-the-art results on WMT2024.

Translating Step-by-Step: Decomposing the Translation Process for Improved Translation Quality of Long-Form Texts

TL;DR

Instead of viewing machine translation as a single, monolithic task, this paper proposes a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations.

Abstract

In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. Instead of viewing machine translation as a single, monolithic task, we propose a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations. Extensive automatic evaluations using Gemini 1.5 Pro across ten language pairs show that translating step-by-step yields large translation quality improvements over conventional zero-shot prompting approaches and earlier human-like baseline strategies, resulting in state-of-the-art results on WMT2024.
Paper Structure (32 sections, 3 figures, 13 tables)

This paper contains 32 sections, 3 figures, 13 tables.

Figures (3)

  • Figure 1: MetricX-$23$ quality improvements (where lower scores indicate better translation quality) on document-level translation on the wmt$24$ test set. Translate step-by-step with Gemini $1.5$ Pro consistently outperforms zero-shot translation.
  • Figure 2: Translate Step-by-Step prompting framework. User prompts (top) and Gemini's responses (bottom) for the translation of an English document into Chinese. The full prompts for each step also appear in §\ref{['sec:appendix_prompts']}.
  • Figure 3: Domain-level comparison between zero-shot and step-by-step translations on wmt$2024$ using reference-based MetricX-$23$. Each data point represents the delta from zero-shot (dotted horizontal line). The steps are denoted as follows: 0 (zero-shot), D (draft after research), R (refinement), and P (proofreading).