Towards Chapter-to-Chapter Context-Aware Literary Translation via Large Language Models

Linghao Jin, Li An, Xuezhe Ma

TL;DR

This work tackles the limited use of long-range discourse in document-level machine translation by introducing a chapter-to-chapter (Ch2Ch) setting and a chapter-aligned JAM dataset of 160 English-Chinese novels. It benchmarks encoder–decoder and decoder-only MT models, and demonstrates that a two-stage fine-tuning pipeline (sentence-level followed by chapter-level) substantially improves literary translation quality, while long-context decoding introduces repetition that requires post-processing and improved decoding strategies. The study also evaluates large language models, finding GPT-4 excels in zero-shot translation, while fine-tuned decoder-only LLMs like ALMA-7B-Stage2 reach competitive performance, especially with chapter-level context. Overall, Ch2Ch emerges as a realistic and valuable framework for context-aware literary translation, with JAM enabling more robust analysis of discourse phenomena and long-context translation challenges.

Abstract

Discourse phenomena in existing document-level translation datasets are sparse, which has been a fundamental obstacle in the development of context-aware machine translation models. Moreover, most existing document-level corpora and context-aware machine translation methods rely on an unrealistic assumption on sentence-level alignments. To mitigate these issues, we first curate a novel dataset of Chinese-English literature, which consists of 160 books with intricate discourse structures. Then, we propose a more pragmatic and challenging setting for context-aware translation, termed chapter-to-chapter (Ch2Ch) translation, and investigate the performance of commonly-used machine translation models under this setting. Furthermore, we introduce a potential approach of finetuning large language models (LLMs) within the domain of Ch2Ch literary translation, yielding impressive improvements over baselines. Through our comprehensive analysis, we unveil that literary translation under the Ch2Ch setting is challenging in nature, with respect to both model learning methods and translation decoding algorithms.

Paper Structure (36 sections, 2 equations, 9 figures, 8 tables)

This paper contains 36 sections, 2 equations, 9 figures, 8 tables.

Figures (9)

  • Figure 1: An example of Ch2Ch translation, illustrating sentence misalignment. Red parts are where a source sentence is split into multiple sentences in the corresponding translation; blue parts are added by translators and have no corresponding source segment; violet parts are deleted by translators in translation.
  • Figure 2: Decoder-only architecture.
  • Figure 3: Prompt template for LLMs.
  • Figure 4: Left: Repetition start position in each sentence. Right: Repetition distribution across various context lengths.
  • Figure 5: Comparison of ALMA-7B finetuned on JAM, with versus without repetition-removal post-processing.
  • ...and 4 more figures