Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement

Yichen Dong; Xinglin Lyu; Junhui Li; Daimeng Wei; Min Zhang; Shimin Tao; Hao Yang

TL;DR

This paper advances document-level machine translation by introducing a two-intermediate refinement framework that leverages both Sent2Sent and Doc2Doc translations. It adds a quality-aware fine-tuning regime across two stages, weighting refinement targets by predicted difficulty to improve final translations. Empirical results across ten En↔X document directions with LLaMA-3-8B-Instruct and Mistral-Nemo-Instruct show consistent gains in d-COMET and related discourse metrics over single-intermediate approaches and strong baselines. The approach also demonstrates robustness through position-bias mitigation, reranking comparisons, and cross-model refinements, underscoring the practical value of combining complementary intermediate translations for document-level MT.

Abstract

Recent research has shown that large language models (LLMs) can enhance translation quality through self-refinement. In this paper, we build on this idea by extending the refinement from sentence-level to document-level translation, specifically focusing on document-to-document (Doc2Doc) translation refinement. Since sentence-to-sentence (Sent2Sent) and Doc2Doc translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement using two intermediate translations, combining the strengths of both Sent2Sent and Doc2Doc. Additionally, recognizing that the quality of intermediate translations varies, we introduce an enhanced fine-tuning method with quality awareness that assigns lower weights to easier translations and higher weights to more difficult ones, enabling the model to focus on challenging translation cases. Experimental results across ten translation tasks with LLaMA-3-8B-Instruct and Mistral-Nemo-Instruct demonstrate the effectiveness of our approach.
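
The quality-aware weighting described above can be illustrated with a minimal sketch. The abstract states only that easier translations receive lower weights and more difficult ones higher weights; the linear weighting form, the `alpha` parameter, the assumption that quality scores lie in [0, 1], and the function names `difficulty_weights` and `weighted_loss` are all hypothetical choices made for illustration, not the authors' formulation.

```python
from typing import List, Sequence


def difficulty_weights(quality_scores: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Map quality scores in [0, 1] to loss weights whose mean is 1.

    A lower quality score (a harder case) yields a higher weight.
    The linear form and `alpha` are illustrative assumptions,
    not the formula used in the paper.
    """
    if not quality_scores:
        raise ValueError("quality_scores must be non-empty")
    raw = [1.0 + alpha * (1.0 - q) for q in quality_scores]
    mean_raw = sum(raw) / len(raw)
    # Normalize so the overall loss scale is unchanged on average.
    return [r / mean_raw for r in raw]


def weighted_loss(
    losses: Sequence[float],
    quality_scores: Sequence[float],
    alpha: float = 1.0,
) -> float:
    """Average per-example losses after scaling each by its difficulty weight."""
    if len(losses) != len(quality_scores):
        raise ValueError("losses and quality_scores must have the same length")
    weights = difficulty_weights(quality_scores, alpha)
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical batch: the second intermediate translation is of lower quality,
# so its training example contributes more to the loss.
example_weights = difficulty_weights([0.9, 0.5])
example_loss = weighted_loss([0.8, 1.2], [0.9, 0.5])
```

Normalizing the weights to a mean of 1 keeps the loss on the same scale as unweighted fine-tuning, so only the relative emphasis between easy and hard examples changes; setting `alpha` to 0 recovers uniform weighting.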
