Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms

Firas Trabelsi, David Vilar, Mara Finkelstein, Markus Freitag

TL;DR

This work tackles the computational bottleneck of Minimum Bayes Risk (MBR) decoding in neural MT by showing empirically that the MBR score matrix is approximately low-rank and can be completed from a small observed subset. It introduces PMBR, which uses Alternating Least Squares (ALS) to recover the missing utilities and then runs standard MBR, requiring up to 16x fewer neural metric computations while preserving translation quality on WMT22 benchmarks. Empirical analyses show a dominant first singular value across metrics, enabling effective rank-1 approximations, and human evaluation corroborates the automatic metrics. The approach offers a practical pathway to deploying MBR in MT and potentially other NLG tasks, with room for exploring alternative completion algorithms and broader domains.

Abstract

Minimum Bayes Risk (MBR) decoding is a powerful decoding strategy widely used for text generation tasks, but its quadratic computational complexity limits its practical application. This paper presents a novel approach for approximating MBR decoding using matrix completion techniques, focusing on the task of machine translation. We formulate MBR decoding as a matrix completion problem, where the utility metric scores between candidate hypotheses and pseudo-reference translations form a low-rank matrix. First, we empirically show that the score matrices indeed have a low-rank structure. Then, we exploit this by computing only a random subset of the scores and efficiently recovering the missing entries in the matrix with the Alternating Least Squares (ALS) algorithm, thereby enabling a fast approximation of the MBR decoding process. Our experimental results on machine translation tasks demonstrate that the proposed method requires 1/16 of the utility metric computations of vanilla MBR decoding while achieving equal translation quality, as measured by COMET22 on the WMT22 dataset (en<>de and en<>ru). We also benchmark our method against other approximation methods and show quality gains over them.
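
The pipeline described in the abstract (score a random subset of hypothesis/pseudo-reference pairs, fill in the rest with ALS, then run standard MBR on the completed matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `als_complete` and `mbr_select`, the ridge term `reg`, the iteration count, the 1/4 observation budget, and the synthetic rank-1 utility matrix are all assumptions chosen for illustration. In practice the observed entries would come from a utility metric rather than from a synthetic matrix.

```python
import numpy as np


def als_complete(scores, mask, rank=1, reg=1e-3, n_iters=30, seed=0):
    """Fill unobserved entries of `scores` using a rank-`rank` ALS fit.

    `mask` is a boolean array, True where a utility score was computed.
    `reg` is a small ridge term (an assumption of this sketch) that keeps
    each least-squares solve well-posed.
    """
    rng = np.random.default_rng(seed)
    n_hyp, n_ref = scores.shape
    U = rng.uniform(0.5, 1.0, size=(n_hyp, rank))  # hypothesis factors
    V = rng.uniform(0.5, 1.0, size=(n_ref, rank))  # pseudo-reference factors
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix V and solve a small ridge regression for each hypothesis row.
        for i in range(n_hyp):
            V_obs = V[mask[i]]
            U[i] = np.linalg.solve(V_obs.T @ V_obs + ridge,
                                   V_obs.T @ scores[i, mask[i]])
        # Fix U and solve for each pseudo-reference column.
        for j in range(n_ref):
            U_obs = U[mask[:, j]]
            V[j] = np.linalg.solve(U_obs.T @ U_obs + ridge,
                                   U_obs.T @ scores[mask[:, j], j])
    # Keep computed scores, use the low-rank estimate everywhere else.
    return np.where(mask, scores, U @ V.T)


def mbr_select(score_matrix):
    """Standard MBR choice: the hypothesis with the highest mean utility."""
    return int(np.argmax(score_matrix.mean(axis=1)))


# Synthetic stand-in for a utility matrix: rank-1 structure plus small noise.
rng = np.random.default_rng(0)
n = 64
full = np.outer(rng.uniform(0.5, 1.0, n), rng.uniform(0.5, 1.0, n))
full += rng.normal(scale=0.005, size=(n, n))

mask = rng.random((n, n)) < 0.25          # pretend only ~1/4 of pairs were scored
observed = np.where(mask, full, 0.0)      # unobserved entries are never read by ALS
completed = als_complete(observed, mask)

best_full = mbr_select(full)
best_approx = mbr_select(completed)
mean_abs_error = float(np.abs(completed - full).mean())
# How much true expected utility is lost by choosing from the completed matrix.
regret = float(full.mean(axis=1)[best_full] - full.mean(axis=1)[best_approx])
print(best_full, best_approx, mean_abs_error, regret)
```

The per-row and per-column solves are written as plain loops for readability; each one is a `rank` x `rank` linear system, so the cost of completion is small compared with computing a neural utility metric for every pair.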

Paper Structure

This paper contains 26 sections, 2 equations, 5 figures, 6 tables, 2 algorithms.

Figures (5)

  • Figure 1: Singular values of an example 124x124 MBR matrix, plotted on a log scale. We observe a sharp drop after the first singular value for the two utility metrics, indicating that the matrix is approximately rank-1.
  • Figure 2: We scored the WMT22 DeEn dataset 1000 times for each available budget. Each scoring run draws, without replacement, 128 samples from the 1024 samples available for each sentence. The highlighted area shows the standard deviation of the scores.
  • Figure 3: We scored the WMT22 DeEn dataset 1000 times for each available budget. Each scoring run draws, without replacement, 32 samples from the 1024 samples available for each sentence. The highlighted area shows the standard deviation of the scores.
  • Figure 4: We scored the WMT22 DeEn dataset 1000 times for each available budget. Each scoring run draws, without replacement, 64 samples from the 1024 samples available for each sentence. The highlighted area shows the standard deviation of the scores.
  • Figure 5: We scored the WMT22 DeEn dataset 1000 times for each available budget. Each scoring run draws, without replacement, 256 samples from the 1024 samples available for each sentence. The highlighted area shows the standard deviation of the scores.
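
The kind of check behind Figure 1 (a sharp drop after the first singular value) can be reproduced on any utility matrix with a few lines of NumPy. The sketch below is illustrative only: the matrix is synthetic (a rank-1 outer product plus small noise, sized 124x124 to match the caption), and the helper name `top_singular_energy` is hypothetical. It does not use the paper's data or metrics.

```python
import numpy as np


def top_singular_energy(matrix, k=1):
    """Fraction of squared singular-value mass carried by the top k values."""
    s = np.linalg.svd(matrix, compute_uv=False)  # sorted in descending order
    return float((s[:k] ** 2).sum() / (s ** 2).sum())


# Synthetic stand-in for an MBR utility matrix (assumption, not real scores).
rng = np.random.default_rng(0)
n = 124
synthetic = np.outer(rng.uniform(0.5, 1.0, n), rng.uniform(0.5, 1.0, n))
synthetic += rng.normal(scale=0.01, size=(n, n))

singular_values = np.linalg.svd(synthetic, compute_uv=False)
energy = top_singular_energy(synthetic)

# On a log scale, a near-rank-1 matrix shows one large value followed by a drop.
print(np.log10(singular_values[:5]))
print(energy)
```

An energy fraction close to 1 for `k=1` is what a rank-1 approximation relies on; running the same function on a real score matrix would show how well that assumption holds for a given utility metric.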