On the Hallucination in Simultaneous Machine Translation

Meizhi Zhong; Kehai Chen; Zhengshan Xue; Lemao Liu; Mingming Yang; Min Zhang

TL;DR

This paper investigates hallucination in Simultaneous Machine Translation (SiMT) by analyzing hallucination words through distributional and predictive lenses and by quantifying target-side context usage with a new metric, TSSR. Using Wait-$k$ policies on IWSLT14 De-En and MuST-C Zh-En, the study finds hallucination words exhibit high distribution entropy and high predictive uncertainty, partly due to limited source context. It shows a strong link between target-context reliance and hallucination, and demonstrates that reducing target-context usage via scheduled sampling can modestly improve BLEU and reduce hallucination rates at low latency. The findings offer a practical avenue for mitigating hallucination in SiMT by balancing context usage, while noting limitations related to policy scope and potential alignment biases.

Abstract

It is widely known that hallucination is a critical issue in Simultaneous Machine Translation (SiMT) due to the absence of source-side information. Although many efforts have been made to improve SiMT performance, few attempt to understand and analyze hallucination in SiMT. We therefore conduct a comprehensive analysis of hallucination in SiMT from two perspectives: the distribution of hallucination words and their target-side context usage. Extensive experiments yield several valuable findings and, in particular, show that hallucination can be alleviated by reducing the overuse of target-side information in SiMT.
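To make the "absence of source-side information" concrete, the following is a minimal sketch of the standard wait-$k$ read schedule used by the policies named in the TL;DR: before emitting target token $t$, the model has read at most $k + t - 1$ source tokens. The function name and the toy sentence lengths are illustrative assumptions, not taken from the paper.

```python
from typing import List


def wait_k_visible_source(k: int, source_len: int, target_len: int) -> List[int]:
    """Return, for each target step t = 1..target_len, how many source
    tokens a wait-k policy has read before emitting target token t."""
    # Wait-k first reads k source tokens, then alternates one write and
    # one read until the source is exhausted.
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]


# Hypothetical 6-token source and 6-token target.
for k in (1, 3):
    print(k, wait_k_visible_source(k, source_len=6, target_len=6))
```

With a small $k$, the first target tokens are produced from only one or two source tokens, which is the low-latency regime where the summary above reports hallucination to be most pronounced.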

Paper Structure (26 sections, 8 equations, 37 figures, 11 tables, 1 algorithm)

Introduction
Experimental Settings
SiMT Models and Datasets.
Hallucination Metric.
Understanding Hallucination Words from Distribution
Hallucination is severe in SiMT.
Understanding Hallucination from Frequency Distribution
Hallucination words are with high distribution entropy.
Understanding Hallucination from Predictive Distribution
Hallucination words are difficult to translate.
Hallucination words are difficult to memorize.
Analysis of Target Context Usage for Hallucination Words
Measure on Target-side Context Usage.
The Relationship between Hallucination and Target-side Context Usage
Using more target context leads to more severe hallucination.
...and 11 more sections

Figures (37)

Figure 1: Word frequency of Hallucination and Overall on the valid-set hypotheses of wait-$1$ (x-axis is ordered randomly; results for additional $k$ are in Appendix \ref{Distribution}).
Figure 2: HR on the valid set in different TSSR intervals of wait-$k$ models.
Figure 3: Word Frequency Rate of Hallucination and Non-Hallucination in different TSSR intervals for wait-$1$ model.
Figure 4: Word Frequency Rate Change ($\Delta$) in different TSSR intervals with scheduled sampling training compared to the Baselines.
Figure 5: Hallucination Frequency Change ($\Delta$) in different TSSR intervals with scheduled sampling training compared to the Baselines.
...and 32 more figures
