Automatic Machine Translation Detection Using a Surrogate Multilingual Translation Model
Cristian García-Romero, Miquel Esplà-Gomis, Felipe Sánchez-Martínez
TL;DR
This work tackles the problem of identifying machine-translated content in parallel corpora, a key pre-processing step for training high-quality machine translation (MT) systems. It introduces SMaTD, which uses latent representations from a pre-trained surrogate multilingual MT model, specifically decoder-block states $h_{k,i}^{(d)}$ projected to $h^{(d')}$, to classify each translation as human (HT) or machine (MT). An optional extension, SMaTD+LM, concatenates features derived from a language model (LM). Empirically, SMaTD consistently outperforms state-of-the-art baselines, with especially large gains for non-English language pairs and in zero-shot settings, and it generalizes well across languages and MT systems. The approach enables robust, language-agnostic filtering of MT content in multilingual corpora, which supports better translation quality and more reliable MT datasets, and it offers insight into which model components carry discriminative MT-detection signals.
Abstract
Modern machine translation (MT) systems depend on large parallel corpora, often collected from the Internet. However, recent evidence indicates that (i) a substantial portion of these texts consists of machine-generated translations, and (ii) overreliance on such synthetic content in training data can significantly degrade translation quality. As a result, filtering out non-human translations is becoming an essential pre-processing step in building high-quality MT systems. In this work, we propose a novel approach that directly exploits the internal representations of a surrogate multilingual MT model to distinguish between human and machine-translated sentences. Experimental results show that our method outperforms current state-of-the-art techniques, particularly for non-English language pairs, achieving gains of at least 5 percentage points in accuracy.
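The classification idea summarized in the TL;DR (decoder states of a surrogate MT model, projected to a smaller representation and fed to a binary HT vs MT classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean pooling over tokens, the ReLU, the single-logit sigmoid head, all function names, and the random placeholder states and weights are assumptions made only for illustration. In a real setup the states would come from the surrogate model and the weights would be learned.

```python
import math
import random


def mean_pool(states):
    """Average token-level vectors (each of size d) into one vector of size d."""
    n = len(states)
    d = len(states[0])
    return [sum(tok[j] for tok in states) / n for j in range(d)]


def linear(x, weights, bias):
    """Compute weights @ x + bias, where weights has shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mt_probability(decoder_states, proj_w, proj_b, clf_w, clf_b):
    """Return a score in (0, 1); higher means 'more likely machine-translated'.

    decoder_states: list of token vectors of size d (placeholder for the
    surrogate model's decoder-block states).
    """
    pooled = mean_pool(decoder_states)             # size d (pooling is an assumption)
    projected = linear(pooled, proj_w, proj_b)     # size d' (the projection step)
    hidden = [max(0.0, v) for v in projected]      # ReLU (assumption)
    logit = linear(hidden, [clf_w], [clf_b])[0]    # single-logit binary head (assumption)
    return sigmoid(logit)


# Toy usage with random placeholders: 3 tokens, d = 4, d' = 2.
random.seed(0)
d, d_prime, n_tokens = 4, 2, 3
states = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n_tokens)]
proj_w = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d_prime)]
proj_b = [0.0] * d_prime
clf_w = [random.uniform(-1, 1) for _ in range(d_prime)]
clf_b = 0.0

prob = mt_probability(states, proj_w, proj_b, clf_w, clf_b)
print(f"P(machine-translated) = {prob:.3f}")
```

Under the same assumptions, an SMaTD+LM-style variant would concatenate LM-derived features to the projected vector before the final classification layer.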
