Using Source-Side Confidence Estimation for Reliable Translation into Unfamiliar Languages

Kenneth J. Sible, David Chiang

TL;DR

The paper addresses translation into target languages the user cannot read by moving confidence estimation to the source side. It introduces a gradient-based, alignment-free confidence estimator that measures how sensitive the output probability is to each source embedding, formalized as $U(x_i)=\sum_{k=1}^{|\mathbf{x}_i|}\left|\frac{\partial\mathbb{P}(y_1,\ldots,y_m\mid x_1,\ldots,x_n)}{\partial\mathbf{x}_i^k}\right|$, where $\mathbf{x}_i^k$ is the $k$-th component of the embedding of source token $x_i$. Subword uncertainties are aggregated into word-level scores, and an intuitive thresholding strategy decides which words to flag. An interactive MT system highlights uncertain source words and proposes edits, while an evaluation framework using GPT-4o provides scalable mistranslation annotations and metrics (F1, AUC). The results show that the proposed method outperforms alignment-based baselines. The work also demonstrates practical applications, including a mobile-ready web app, and outlines plans for broader language coverage and dictionary-assisted corrections. Overall, it advances transparent, user-guided MT by shifting the confidence signal to the source side and enabling targeted user intervention.
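To make the formula for $U(x_i)$ concrete, here is a minimal Python sketch under several stated assumptions. The function `toy_prob` is a hypothetical stand-in for the translation probability (a sigmoid of a weighted sum, not an MT model), the gradient is approximated with central finite differences rather than the automatic differentiation a real system would use, and both the summation over subwords and the threshold value are illustrative choices that the summary does not specify.

```python
import math

def toy_prob(embeddings, weights):
    """Hypothetical stand-in for P(y_1..y_m | x_1..x_n): sigmoid of a weighted sum."""
    z = sum(w * v
            for emb, ws in zip(embeddings, weights)
            for v, w in zip(emb, ws))
    return 1.0 / (1.0 + math.exp(-z))

def token_uncertainty(prob_fn, embeddings, i, eps=1e-5):
    """U(x_i): sum over embedding dimensions k of |dP / dx_i^k|.

    The partial derivatives are approximated by central differences;
    this is an assumption made to keep the sketch dependency-free.
    """
    total = 0.0
    for k in range(len(embeddings[i])):
        plus = [list(e) for e in embeddings]
        minus = [list(e) for e in embeddings]
        plus[i][k] += eps
        minus[i][k] -= eps
        total += abs(prob_fn(plus) - prob_fn(minus)) / (2 * eps)
    return total

def word_uncertainty(token_scores, word_ids):
    """Aggregate subword scores per word by summation (an assumed choice)."""
    scores = {}
    for score, word in zip(token_scores, word_ids):
        scores[word] = scores.get(word, 0.0) + score
    return scores

def flag_words(word_scores, threshold):
    """Return the ids of words whose score exceeds the (assumed) threshold."""
    return sorted(w for w, s in word_scores.items() if s > threshold)

# Toy data: three source tokens with 2-dimensional embeddings.
embeddings = [[0.1, -0.2], [0.3, 0.0], [0.0, 0.1]]
weights = [[0.1, 0.1], [2.0, -1.5], [0.5, 0.2]]
prob_fn = lambda e: toy_prob(e, weights)

token_scores = [token_uncertainty(prob_fn, embeddings, i)
                for i in range(len(embeddings))]
word_ids = [0, 1, 1]  # tokens 1 and 2 are subwords of the same word
word_scores = word_uncertainty(token_scores, word_ids)
flagged = flag_words(word_scores, threshold=0.5)
print(word_scores, flagged)
```

In this toy setup the second word dominates because its weights are largest, so it is the only one flagged. With a real MT model, `prob_fn` would be replaced by the model's probability of its own output, and the gradient would come from backpropagation.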

Abstract

We present an interactive machine translation (MT) system designed for users who are not proficient in the target language. It aims to improve trustworthiness and explainability by identifying potentially mistranslated words and allowing the user to intervene to correct mistranslations. However, confidence estimation in machine translation has traditionally focused on the target side. Whereas the conventional approach to source-side confidence estimation would have been to project target word probabilities to the source side via word alignments, we propose a direct, alignment-free approach that measures how sensitive the target word probabilities are to changes in the source embeddings. Experimental results show that our method outperforms traditional alignment-based methods at detection of mistranslations.

Paper Structure

This paper contains 9 sections, 1 equation, 5 figures, 1 table.

Figures (5)

  • Figure 1: A web application for an interactive MT system that highlights potentially mistranslated words that have been assigned high uncertainty scores by our gradient-based attribution method.
  • Figure 2: A GPT-4o mistranslation detection prompt. The prompt is written to detect mistranslation pairs between a source sentence and a candidate sentence, and to match each pair with a word from the provided reference translation. In addition to these instructions, the prompt also contains some examples with explanations.
  • Figure 3: A German source sentence from the test set along with the reference sentence, the MT candidate translation, and the GPT-4o output for the mistranslation detection task. The MT output translates back to "Greenland swims rapidly along the seafloor of the North Atlantic and covers an average of 1,220 meters per hour."
  • Figure 4: Precision-Recall (PR) and Receiver Operating Characteristic (ROC) curves comparing our gradient-based attribution method with two baseline approaches for mistranslation detection. The PR curve (left) captures the performance of the positive class, which is more relevant for confidence estimation. The ROC curve (right) reflects the overall discriminative power of these methods as binary classifiers for mistranslation detection.
  • Figure 5: An example of using the web application for an interactive MT system to translate a sentence from English to German. To start, the user enters an English sentence and clicks the translate button (left). When the translation appears, any potentially mistranslated words in the input are highlighted (middle). The user can then click on a highlighted word to see suggestions for alternative words, or enter their own replacement (right).
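The F1 and AUC metrics mentioned in the TL;DR, and the ROC curve described in the Figure 4 caption, treat each uncertainty score as the output of a binary classifier for mistranslation. The sketch below shows one way such metrics can be computed; the scores and labels are made-up toy values, not results from the paper, and the function names `f1_at_threshold` and `roc_auc` are hypothetical.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the positive (mistranslated) class when flagging scores > threshold."""
    preds = [s > threshold for s in scores]
    tp = sum(1 for p, l in zip(preds, labels) if p and l)
    fp = sum(1 for p, l in zip(preds, labels) if p and not l)
    fn = sum(1 for p, l in zip(preds, labels) if not p and l)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a random positive outscores a random negative,
    counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy word-level uncertainty scores and mistranslation labels (illustrative only).
scores = [0.9, 0.8, 0.3, 0.6, 0.1]
labels = [True, False, False, True, False]

f1 = f1_at_threshold(scores, labels, threshold=0.5)
auc = roc_auc(scores, labels)
print(f1, auc)
```

Sweeping `threshold` over the observed scores and recording precision and recall at each value would trace out a precision-recall curve of the kind shown in the left panel of Figure 4.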