Generating Diverse Translation with Perturbed kNN-MT
Yuto Nishida, Makoto Morishita, Hidetaka Kamigaito, Taro Watanabe
TL;DR
The paper tackles limited translation diversity by using perturbed kNN-MT to address overcorrection in NMT. It introduces three variants, noised-kNN, randomized-kNN, and uniquify-kNN, and integrates them with diversified decoding. Perturbing the kNN retrieval expands the search space so that more diverse target tokens are considered, while fluency is maintained and translation quality is largely preserved. The approach relies on datastore-based retrieval and interpolation with MT probabilities, and uses static or adaptive noise, or random sampling, so that diversity can be controlled through the perturbation magnitude. Experiments on domain adaptation and general-domain language pairs show substantial gains in diversity (DP) with manageable quality trade-offs, and Randomized-kNN often provides the best practical balance without extra cost. The work shows that combining kNN-MT with diversification can make translation generation more controllable and diverse, and it notes limitations that warrant future research: latency, memory usage, possible hallucinations, and retrieval challenges.
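The retrieval, interpolation, and perturbation steps summarized above can be illustrated with a minimal sketch. It follows the standard kNN-MT formulation (a distance-weighted distribution over retrieved target tokens, mixed with the MT distribution by a weight `lam`). The perturbation shown, Gaussian noise added to the query vector with a fixed standard deviation, is only an assumption standing in for the noised variant; the paper's exact noise scheme is not reproduced here. All names (`knn_distribution`, `interpolate`, `perturb_query`, the toy datastore and probabilities) are hypothetical.

```python
import math
import random


def knn_distribution(query, datastore, k, temperature):
    """Return {token: prob} built from the k nearest datastore entries."""
    # Squared L2 distance between the query and every stored key.
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    ]
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]
    # Softmax over negative distances, summed per target token.
    # Subtracting the smallest distance keeps exp() numerically stable.
    min_dist = neighbors[0][0]
    weights = {}
    for dist, token in neighbors:
        w = math.exp(-(dist - min_dist) / temperature)
        weights[token] = weights.get(token, 0.0) + w
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def interpolate(p_mt, p_knn, lam):
    """Mix the two distributions: lam * p_kNN + (1 - lam) * p_MT."""
    vocab = set(p_mt) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_mt.get(t, 0.0)
        for t in vocab
    }


def perturb_query(query, noise_std, rng):
    """Assumed perturbation: add zero-mean Gaussian noise to the query."""
    return [q + rng.gauss(0.0, noise_std) for q in query]


# Toy datastore of (key vector, target token) pairs; values are made up.
datastore = [
    ([0.0, 0.0], "cat"),
    ([0.1, 0.0], "cat"),
    ([1.0, 1.0], "kitten"),
    ([1.1, 0.9], "feline"),
]
p_mt = {"cat": 0.9, "kitten": 0.05, "feline": 0.05}
query = [0.05, 0.0]

rng = random.Random(0)
for noise_std in (0.0, 0.5, 1.0):
    noisy_query = perturb_query(query, noise_std, rng)
    p_knn = knn_distribution(noisy_query, datastore, k=2, temperature=1.0)
    print(noise_std, interpolate(p_mt, p_knn, lam=0.5))
```

With `noise_std` set to zero the sketch reduces to plain kNN-MT. Larger values let the perturbed query land near other datastore entries, so tokens the unperturbed retrieval would not return can receive probability mass, which is the sense in which the perturbation magnitude controls diversity.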
Abstract
Generating multiple translation candidates would enable users to choose the one that satisfies their needs. Although there has been work on diversified generation, there is still room to improve diversity, mainly because previous methods do not address the overcorrection problem: the model underestimates a prediction that is largely different from the training data, even if that prediction is likely. This paper proposes methods that generate more diverse translations by introducing perturbed k-nearest neighbor machine translation (kNN-MT). Our methods expand the search space of kNN-MT and help incorporate diverse words into candidates by addressing the overcorrection problem. Our experiments show that the proposed methods drastically improve candidate diversity and can control the degree of diversity by tuning the perturbation's magnitude.
