Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport

Yuu Jinnai

TL;DR

This work addresses the challenge of document-level text generation by extending Minimum Bayes Risk (MBR) decoding with Optimal Transport (OT). It introduces MBR-OT, which uses a Wasserstein distance-based utility between distributions of document segments, so that sentence-level utility functions can guide document-level generation despite structural variations such as sentence reordering and merging. Across document-level machine translation, simplification, summarization, and dense image captioning, MBR-OT (particularly the Wasserstein distance (WD) variant and its entropic variant) consistently outperforms standard MBR and LA-based baselines while remaining robust to model noise. The approach reuses strong sentence-level metrics for document-scale evaluation and offers practical efficiency optimizations; the code is released to support reproducibility and further research in document-level generation.

Abstract

Document-level text generation tasks are known to be more difficult than sentence-level text generation tasks, as they require understanding a longer context to generate high-quality text. In this paper, we investigate the adaptation of Minimum Bayes Risk (MBR) decoding to document-level text generation tasks. MBR decoding uses a utility function to select, from a set of candidate outputs, the output with the highest estimated expected utility. Although MBR decoding has been shown to be effective in a wide range of sentence-level text generation tasks, its performance on document-level text generation tasks is limited because many utility functions are designed to evaluate sentences. To this end, we propose MBR-OT, a variant of MBR decoding that uses Wasserstein distance to compute the utility of a document from a sentence-level utility function. Experimental results show that MBR-OT outperforms standard MBR in document-level machine translation, text simplification, and dense image captioning tasks. Our code is available at https://github.com/jinnaiyuu/mbr-optimal-transport
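
The selection step that the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the sampled candidates also serve as the pseudo-references, and the names `token_f1` and `mbr_decode` are hypothetical. The toy token-overlap utility stands in for a learned metric.

```python
from typing import Callable, Optional, Sequence


def token_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): F1 over unique lowercased tokens."""
    h, r = set(hyp.lower().split()), set(ref.lower().split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    candidates: Sequence[str],
    utility: Callable[[str, str], float],
    references: Optional[Sequence[str]] = None,
) -> str:
    """Return the candidate with the highest average utility against the references."""
    refs = candidates if references is None else references

    def expected_utility(candidate: str) -> float:
        # Monte Carlo estimate: mean utility over the pseudo-references.
        return sum(utility(candidate, ref) for ref in refs) / len(refs)

    return max(candidates, key=expected_utility)


samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran in the park",
]
best = mbr_decode(samples, token_f1)
print(best)
```

Because the utility is a plug-in argument, a document-level utility such as one built on Wasserstein distance could be passed in place of `token_f1` without changing the selection loop.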

Paper Structure

This paper contains 33 sections, 12 equations, 5 figures, 14 tables.

Figures (5)

  • Figure 1: Illustrative example of a metric using Wasserstein distance over two texts "I love cats. I love dogs" and "I love dogs. I love cats.". Each output is split into a set of segments (e.g., sentences), and a utility function is used to compute the utility of each pair of segments, one from each output. Because Wasserstein distance is robust to changes in the structure of the text, such as sentence reordering, it is an adaptable measure for a wide range of tasks.
  • Figure 2: Evaluation of MBR-OT on document-level machine translation tasks. WD metric with MetricX-23 as a sentence-level utility function is used as the evaluation metric.
  • Figure 3: Evaluation of MBR-OT on document-level summarization and simplification tasks.
  • Figure 4: Evaluation of MBR-OT on dense image captioning tasks.
  • Figure 5: Evaluation of MBR-OT on document-level machine translation tasks using COMET-22. Llama-3.1 is used as the text generation model.
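
The Figure 1 example can be worked through in a few lines of code. The sketch below is an illustration under several assumptions that do not come from the paper: segments are obtained by splitting on periods, every segment carries equal weight, the segment-level cost is one minus the Jaccard token overlap, and the transport problem is solved by brute force over permutations, which is only feasible for a handful of segments. The names `segment`, `jaccard_cost`, and `wasserstein_cost` are hypothetical.

```python
from itertools import permutations
from math import gcd
from typing import Callable, List


def segment(doc: str) -> List[str]:
    """Naive splitter (an assumption): one segment per period-delimited sentence."""
    return [s.strip().lower() for s in doc.split(".") if s.strip()]


def jaccard_cost(a: str, b: str) -> float:
    """Toy segment-level cost: 1 minus the Jaccard overlap of the token sets."""
    ta, tb = set(a.split()), set(b.split())
    return 1.0 - len(ta & tb) / len(ta | tb)


def wasserstein_cost(
    doc_a: str, doc_b: str, cost: Callable[[str, str], float] = jaccard_cost
) -> float:
    """Optimal transport cost between uniformly weighted segment sets.

    Each segment is duplicated so that both sides have n equal-mass points;
    an optimal plan is then a permutation, found here by brute force.
    """
    seg_a, seg_b = segment(doc_a), segment(doc_b)
    if not seg_a or not seg_b:
        raise ValueError("both documents need at least one segment")
    n = len(seg_a) * len(seg_b) // gcd(len(seg_a), len(seg_b))
    rows = [s for s in seg_a for _ in range(n // len(seg_a))]
    cols = [s for s in seg_b for _ in range(n // len(seg_b))]
    c = [[cost(r, q) for q in cols] for r in rows]
    best = min(sum(c[i][p[i]] for i in range(n)) for p in permutations(range(n)))
    return best / n


reordered = wasserstein_cost("I love cats. I love dogs", "I love dogs. I love cats.")
merged = wasserstein_cost("I love cats and dogs.", "I love cats. I love dogs.")
print(reordered, merged)
```

Under these assumptions the reordered pair has zero transport cost, because each sentence is matched to its identical counterpart regardless of position, whereas comparing the sentences in their original order would give a cost of 0.5. A utility can be obtained as one minus the cost; for realistic segment counts, a linear-programming or dedicated optimal transport solver would replace the brute-force search.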