Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation

Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura

TL;DR

This work tackles the mismatch between estimated posterior probability and translation quality in neural machine translation by introducing source-based MBR (sMBR), which uses paraphrased or back-translated quasi-sources as support hypotheses and a reference-free quality estimator as the utility. It formalizes sMBR and presents two instantiations, sMBR-PP (paraphrase-based) and sMBR-BT (back-translation-based), demonstrating that sMBR-PP consistently outperforms QE reranking and standard MBR in both classic and LLM-enabled translation settings across multiple language pairs. The results indicate that combining source variants with a QE-based utility yields more robust translation selections, though efficiency remains a challenge. The work points to future improvements in paraphrase quality, diversity strategies, and broader language coverage, offering a new, source-centric decoding paradigm with practical impact for NMT quality.

Abstract

Maximum a posteriori (MAP) decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility. Inspired by Quality Estimation (QE) reranking, which uses the QE model as a ranker, we propose source-based MBR (sMBR) decoding, a novel approach that uses quasi-sources (generated via paraphrasing or back-translation) as "support hypotheses" and a reference-free quality estimation metric as the utility function, making this the first work to use only sources in MBR decoding. Experiments show that sMBR outperforms both QE reranking and standard MBR decoding. Our findings suggest that sMBR is a promising approach for NMT decoding.
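The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_qe`, `TOY_LEXICON`, `qe_rerank`, and `smbr_select` are hypothetical names introduced here, the toy scorer merely stands in for a learned reference-free QE model, and the uniform average over quasi-sources is an assumption. Whether the original source is included among the quasi-sources is left to the caller.

```python
from typing import Callable, Sequence

# Hypothetical toy lexicon; a real system would call a learned QE model instead.
TOY_LEXICON = {"hallo": "hello", "welt": "world", "guten": "good", "tag": "day"}

# A reference-free QE model maps (source, hypothesis) to a quality score.
QEModel = Callable[[str, str], float]


def toy_qe(source: str, hypothesis: str) -> float:
    """Stand-in QE score: fraction of source tokens whose lexicon
    translation appears in the hypothesis."""
    src_tokens = source.lower().split()
    if not src_tokens:
        return 0.0
    hyp_tokens = set(hypothesis.lower().split())
    hits = sum(1 for tok in src_tokens if TOY_LEXICON.get(tok) in hyp_tokens)
    return hits / len(src_tokens)


def qe_rerank(source: str, candidates: Sequence[str], qe: QEModel) -> str:
    """QE reranking: score each candidate against the single original source."""
    return max(candidates, key=lambda hyp: qe(source, hyp))


def smbr_select(
    quasi_sources: Sequence[str], candidates: Sequence[str], qe: QEModel
) -> str:
    """sMBR-style selection: pick the candidate with the highest mean QE
    score over all quasi-sources (assumed uniform weighting)."""
    if not quasi_sources:
        raise ValueError("at least one quasi-source is required")

    def expected_utility(hyp: str) -> float:
        return sum(qe(src, hyp) for src in quasi_sources) / len(quasi_sources)

    return max(candidates, key=expected_utility)


# Quasi-sources would come from paraphrasing (sMBR-PP) or
# back-translation (sMBR-BT); here they are written by hand.
quasi_sources = ["hallo welt", "guten tag welt"]
candidates = ["hello world", "good day", "hello"]
best = smbr_select(quasi_sources, candidates, toy_qe)
```

In this sketch the only difference between QE reranking and sMBR is the set of sources the QE model is evaluated against: one source in the former, several quasi-sources averaged together in the latter.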

Paper Structure

This paper contains 34 sections, 8 equations, 3 figures, and 14 tables.

Figures (3)

  • Figure 1: Example of De$\rightarrow$En, with source "Kommt einem Spitzel nahe". BS denotes beam search. The estimated log probability of a human reference is lower than that of the beam search output, and even lower than that of a bad translation.
  • Figure 2: Overview of decoding methods in NMT. The diagram illustrates the process for MBR decoding, QE reranking, and the proposed sMBR decoding. It also shows two instantiations of sMBR: sMBR-BT and sMBR-PP. The figure shows how the score used to select the final hypothesis is computed for each method.
  • Figure 3: Impact of the number of candidate hypotheses on the evaluation metrics in the En$\rightarrow$De high-resource setup. The horizontal axis shows the number of candidate hypotheses and the vertical axis shows the evaluation metric scores.
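
The three selection scores contrasted in Figure 2 can be written side by side as a rough sketch. The notation is an assumption introduced here and may differ from the paper's own equations: $\mathcal{H}$ is the candidate set, $\mathcal{S}$ the set of support hypotheses, $u$ a reference-based utility, $x$ the original source, $\mathcal{X}'$ the set of quasi-sources, and $f_{\mathrm{QE}}$ a reference-free QE metric. Uniform averaging is the usual Monte Carlo approximation and is also assumed.

$$
\begin{aligned}
\text{MBR:}\quad & \hat{y} = \arg\max_{h \in \mathcal{H}} \frac{1}{|\mathcal{S}|} \sum_{s \in \mathcal{S}} u(h, s) \\
\text{QE reranking:}\quad & \hat{y} = \arg\max_{h \in \mathcal{H}} f_{\mathrm{QE}}(x, h) \\
\text{sMBR:}\quad & \hat{y} = \arg\max_{h \in \mathcal{H}} \frac{1}{|\mathcal{X}'|} \sum_{x' \in \mathcal{X}'} f_{\mathrm{QE}}(x', h)
\end{aligned}
$$

Under this reading, QE reranking is the special case of sMBR in which $\mathcal{X}'$ contains only the original source $x$.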