Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation

Anas Himmi; Guillaume Staerman; Marine Picot; Pierre Colombo; Nuno M. Guerreiro

TL;DR

This paper tackles hallucination detection in neural machine translation by exploiting the complementarity of existing detectors through STARE, a simple unsupervised aggregation method that normalizes and weights the scores of multiple detectors. The method aggregates both external proxies (e.g., quality estimators, cross-lingual similarities) and internal model signals (e.g., Seq-Logprob, attention-based metrics) to produce a single robust hallucination score $\text{Agg}(x') = \sum_{k=1}^{K} w_k s_k(x')$, where $s_k(x')$ is the score of the $k$-th detector and $w_k$ its weight. Across two human-annotated benchmarks, LfaN-Hall and HalOmi, STARE consistently outperforms individual detectors and other baselines, with notable gains when combining internal detectors, which can surpass external-only aggregates. The work provides extensive ablations on detector selection and reference-set size, demonstrates robustness to calibration data, and releases code and scores to foster reproducibility and further research. Overall, STARE offers a practical, effective route to more reliable NMT systems by exploiting detector complementarities in an unsupervised fashion.
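The weighted sum above can be illustrated with a minimal sketch. It assumes min-max normalization against a reference set of scores, clipping to [0, 1], and uniform weights as a placeholder; none of these choices is confirmed by this summary as the paper's exact procedure. The function names `min_max_normalize` and `aggregate` and the numeric scores are hypothetical.

```python
from typing import Optional, Sequence


def min_max_normalize(score: float, reference: Sequence[float]) -> float:
    """Rescale one detector score to [0, 1] using a reference set's min and max."""
    lo, hi = min(reference), max(reference)
    if hi == lo:  # degenerate reference set: no spread to normalize against
        return 0.0
    # Assumption: scores outside the reference range are clipped to [0, 1].
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def aggregate(
    scores: Sequence[float],
    references: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Compute Agg(x') = sum_k w_k * s_k(x') over K normalized detector scores."""
    if len(scores) != len(references):
        raise ValueError("need one reference set per detector")
    if weights is None:
        # Assumption: uniform weights, a placeholder rather than the paper's scheme.
        weights = [1.0 / len(scores)] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("need one weight per detector")
    normalized = [min_max_normalize(s, ref) for s, ref in zip(scores, references)]
    return sum(w * s for w, s in zip(weights, normalized))


# Hypothetical scores from K = 2 detectors, both oriented so that
# a higher value means "more likely hallucinated".
reference_scores = [[0.0, 2.0, 4.0], [10.0, 20.0, 30.0]]
agg = aggregate([3.0, 15.0], reference_scores)
print(agg)
```

Normalizing each detector against its own reference set puts scores with different ranges on a common scale before they are summed, so no single detector dominates purely because of its units.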

Abstract

Hallucinated translations pose significant threats and safety concerns for the practical deployment of machine translation systems. Previous research has found that detectors exhibit complementary performance: different detectors excel at detecting different types of hallucinations. In this paper, we propose to address the limitations of individual detectors by combining them, and we introduce a straightforward method for aggregating multiple detectors. Our results demonstrate the efficacy of our aggregated detector, providing a promising step towards ever more reliable machine translation systems.

Paper Structure (35 sections, 4 equations, 1 figure, 4 tables)

Introduction
Detectors Aggregation Method
Problem Statement
Preliminaries.
Hallucination detection.
Aggregation.
Proposed Aggregation Method
Experimental Setup
Datasets
LfaN-Hall.
HalOmi.
Aggregation Baselines.
Evaluation method.
Implementation details.
Performances Analysis
...and 20 more sections

Figures (1)

Figure 1: Impact of reference set size on LfaN-Hall.
