Optimizing example selection for retrieval-augmented machine translation with translation memories

Maxime Bouthors; Josep Crego; François Yvon


TL;DR

The paper investigates how to select the best set of $k$ similar examples from a translation memory to condition a fixed, non-autoregressive downstream editor: the multi-Levenshtein Transformer. It casts example selection as a submodular coverage problem, introducing a smoothed, weight-aware coverage function controlled by a parameter $\lambda$ and providing both greedy and approximation-based algorithms with theoretical guarantees. Empirically, Levenshtein-distance-based coverage tends to yield the largest BLEU gains across multiple English–French domains, though the improvements are domain-sensitive and depend on normalization and on how $\lambda$ is tuned. The work suggests that efficient, coverage-driven retrieval can modestly improve translation quality in retrieval-augmented MT. It also notes that longer, higher-coverage examples may complicate joint editing, and that future work should align retrieval with the translation model's editing behavior.
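For readers unfamiliar with the term, the standard textbook definition of submodularity and the classical greedy guarantee are recalled here. This is general background, not a statement taken from the paper: the symbols $f$, $V$, $A$, $B$, $x$ and $S_{\text{greedy}}$ are notation chosen for this note, and whether the paper's smoothed coverage function meets the monotonicity assumption for every $\lambda$ is not stated in this summary.

$$
f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B)
\qquad \text{for all } A \subseteq B \subseteq V,\; x \in V \setminus B
$$

$$
f(S_{\text{greedy}}) \;\ge\; \left(1 - \tfrac{1}{e}\right) \max_{S \subseteq V,\; |S| \le k} f(S)
\qquad \text{if } f \text{ is non-negative, monotone and submodular}
$$

The first line says that adding an example helps less once more examples are already selected (diminishing returns). The second is the classical bound for greedy selection under a cardinality constraint $k$.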

Abstract

Retrieval-augmented machine translation leverages examples from a translation memory by retrieving similar instances. These examples are used to condition the predictions of a neural decoder. We aim to improve the upstream retrieval step and consider a fixed downstream edit-based model: the multi-Levenshtein Transformer. The task consists of finding a set of examples that maximizes the overall coverage of the source sentence. To this end, we rely on the theory of submodular functions and explore new algorithms to optimize this coverage. We evaluate the resulting performance gains for the machine translation task.
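The idea of picking examples that jointly cover the source sentence can be illustrated with a minimal greedy sketch. It assumes a simple token-overlap coverage with optional per-token weights; the function names `coverage` and `greedy_select` are hypothetical, and this is not the paper's smoothed, $\lambda$-controlled function or its exact algorithm.

```python
from typing import Dict, List, Optional, Sequence, Set


def coverage(source: Sequence[str],
             selected: List[Sequence[str]],
             weights: Dict[str, float]) -> float:
    """Weighted count of distinct source tokens found in at least one selected example.

    Assumption: tokens missing from `weights` get weight 1.0.
    """
    covered: Set[str] = set()
    for example in selected:
        covered.update(example)
    return sum(weights.get(tok, 1.0) for tok in set(source) if tok in covered)


def greedy_select(source: Sequence[str],
                  candidates: List[Sequence[str]],
                  k: int,
                  weights: Optional[Dict[str, float]] = None) -> List[int]:
    """Return indices of up to k candidates, chosen greedily by marginal coverage gain."""
    weights = weights or {}
    chosen: List[int] = []
    selected: List[Sequence[str]] = []
    remaining = list(range(len(candidates)))
    current = 0.0
    for _ in range(min(k, len(candidates))):
        best_gain, best_idx = 0.0, None
        for i in remaining:
            gain = coverage(source, selected + [candidates[i]], weights) - current
            if gain > best_gain:  # strict: ties keep the earliest candidate
                best_gain, best_idx = gain, i
        if best_idx is None:
            break  # no remaining candidate covers anything new
        chosen.append(best_idx)
        selected.append(candidates[best_idx])
        remaining.remove(best_idx)
        current += best_gain
    return chosen


if __name__ == "__main__":
    src = "the cat sleeps on the big carpet".split()
    tm = [
        "the cat sleeps on a bed".split(),
        "the dog is on the big carpet".split(),
        "the cat sleeps".split(),
    ]
    print(greedy_select(src, tm, k=2))
```

In this toy run the third candidate is never selected: every token it shares with the source is already covered by the first one, so its marginal gain is zero. A pure similarity ranking would not see that redundancy, which is the motivation for optimizing set-level coverage instead.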


Paper Structure (25 sections, 27 equations, 7 figures, 5 tables, 2 algorithms)


Introduction
Related Work
The multi-Levenshtein Model
Information Retrieval in a Translation Memory
Similar Sentence Retrieval (RPS, Recherche de Phrases Similaires)
Submodular Functions and Coverage
Definitions
Maximizing Coverage
Experimental Setup
Data
Coverage Scores
Metrics
Results
Role of the chosen score:
Role of $\lambda$:
...and 10 more sections

Figures (7)

Figure 1: First decoding step of TM$^{k}$-LevT. The two examples being edited are $y_1$: le chat dort sur un lit and $y_2$: le chien est sur le grand tapis. The insertions predicted in step 2 (insertion) are shown as integers, then materialized as '_' in step 3 (combination).
Figure 2: Average BLEU scores as a function of the DL similarity score, for $\lambda \in \{0, 0.2, 1\}$ on test-0.4 (left) and test-0.6 (right).
Figure 3: Average coverage and relevance as a function of $\lambda$ for DL on test-0.4 (left) and test-0.6 (right).
Figure 4: Average length as a function of $\lambda$ for DL on test-0.4 (left) and test-0.6 (right).
Figure 5: Average BLEU score as a function of $\lambda$ for DL on test-0.4 (left) and test-0.6 (right).
...and 2 more figures

Theorems & Definitions (6)

proof
proof
proof
proof
proof
proof
