Discourse Graph Guided Document Translation with Large Language Models

Viet-Thanh Pham, Minghan Wang, Hao-Han Liao, Thuy-Trang Vu

TL;DR

TransGraph introduces a discourse-graph guided approach to document-level machine translation. By partitioning text into coherent chunks and constructing a labeled discourse graph, it selectively conditions each chunk's translation on a small, graph-neighbourhood context rather than the full document, achieving robust improvements in d-BLEU, d-COMET, and terminology accuracy while reducing token overhead. Across three benchmarks and multiple LLM backbones, TransGraph outperforms sentence-level, single-pass, and agent-based baselines, with strong ablations confirming the value of coherent chunking, explicit discourse relations, and graph structure. The method demonstrates backbone-agnostic efficiency and cross-lingual robustness, highlighting structured discourse retrieval as a practical lever for high-quality DocMT.

Abstract

Adapting large language models to full document translation remains challenging due to the difficulty of capturing long-range dependencies and preserving discourse coherence throughout extended texts. While recent agentic machine translation systems mitigate context window constraints through multi-agent orchestration and persistent memory, they require substantial computational resources and are sensitive to memory retrieval strategies. We introduce TransGraph, a discourse-guided framework that explicitly models inter-chunk relationships through structured discourse graphs and selectively conditions each translation segment on relevant graph neighbourhoods rather than relying on sequential or exhaustive context. Across three document-level MT benchmarks spanning six languages and diverse domains, TransGraph consistently surpasses strong baselines in translation quality and terminology consistency while incurring significantly lower token overhead.
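
The selective conditioning described in the abstract can be illustrated with a minimal sketch. It assumes the discourse graph has already been built; the class name `DiscourseGraph`, the relation labels, the undirected edges, the distance-based ranking, and the prompt wording are all hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseGraph:
    """Chunks of one document plus labeled relations between chunk pairs."""
    chunks: list[str]
    # edges[i][j] holds the relation label between chunk i and chunk j.
    edges: dict[int, dict[int, str]] = field(default_factory=dict)

    def add_relation(self, src: int, dst: int, label: str) -> None:
        # Assumption: relations are stored as undirected edges.
        self.edges.setdefault(src, {})[dst] = label
        self.edges.setdefault(dst, {})[src] = label

    def neighbourhood(self, idx: int, max_neighbours: int = 3):
        """Return up to max_neighbours (index, label, text) triples for chunk idx."""
        neighbours = self.edges.get(idx, {})
        # Assumption: when there are too many neighbours, keep the closest ones.
        ranked = sorted(neighbours, key=lambda j: abs(j - idx))
        return [(j, neighbours[j], self.chunks[j]) for j in ranked[:max_neighbours]]


def build_prompt(graph: DiscourseGraph, idx: int, target_lang: str,
                 max_neighbours: int = 3) -> str:
    """Assemble a translation prompt from the chunk and its graph neighbours only."""
    lines = [f"Translate the following chunk into {target_lang}."]
    for j, label, text in graph.neighbourhood(idx, max_neighbours):
        lines.append(f"Related chunk {j} ({label}): {text}")
    lines.append(f"Chunk to translate: {graph.chunks[idx]}")
    return "\n".join(lines)


# Toy document with four chunks and two hypothetical relations.
graph = DiscourseGraph(chunks=["A", "B", "C", "D"])
graph.add_relation(0, 3, "elaboration")
graph.add_relation(2, 3, "contrast")
prompt = build_prompt(graph, 3, "German", max_neighbours=1)
```

With `max_neighbours=1`, the prompt for chunk 3 carries only chunk 2 as context, so the prompt length depends on the size of the neighbourhood rather than on the length of the document.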

Paper Structure

This paper contains 33 sections, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Illustration of our proposed framework, TransGraph. In Stage 1, TransGraph takes a document as input, splits it into multiple small chunks, identifies the discourse relations between every pair of chunks, and represents them as a knowledge graph. In Stage 2, each chunk is translated by retrieving its adjacent chunks in the graph together with their corresponding relations.
  • Figure 2: Distribution of the ratio of 5-nearest relations to the total number of relations, computed over the documents of the BWB and ACL 60/60 benchmarks.
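
The per-document quantity plotted in Figure 2 can be sketched as follows. The caption does not define "5-nearest relations", so this sketch assumes it means relations whose two chunks lie at most 5 positions apart in the document; the function name `near_relation_ratio` and the edge representation are hypothetical.

```python
def near_relation_ratio(edges: list[tuple[int, int]], window: int = 5) -> float:
    """Share of relations that link chunks at most `window` positions apart.

    `edges` holds one (i, j) pair of chunk indices per discourse relation.
    Assumption: "5-nearest" is read as |i - j| <= 5, which the caption does not confirm.
    """
    if not edges:
        return 0.0  # a document with no relations contributes a ratio of 0
    near = sum(1 for i, j in edges if abs(i - j) <= window)
    return near / len(edges)


# Toy document: two of the four relations stay within the 5-chunk window.
example_edges = [(0, 1), (0, 7), (2, 6), (3, 10)]
ratio = near_relation_ratio(example_edges)
```

Collecting this ratio for every document in a benchmark would give the kind of distribution the caption describes, under the stated reading of the term.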