EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation
Yuqiao Wen, Behzad Shayegh, Chenyang Huang, Yanshuai Cao, Lili Mou
TL;DR
The paper addresses zero-shot multilingual machine translation by introducing EBBS, an ensemble decoding framework that uses bi-level beam search to let each translation path explore its own predictions while a soft-voting mechanism synchronizes results at each generation step. By combining direct and pivot translations, EBBS improves zero-shot quality over traditional direct or pivot approaches and surpasses other ensemble methods on IWSLT and Europarl. Additionally, EBBS-based distillation leverages high-quality ensemble outputs to train a single, efficient model without increasing inference cost, sometimes even boosting translation quality. The approach offers a practical path to stronger zero-shot MT with scalable inference, validated by comprehensive experiments and analyses.
Abstract
Zero-shot translation ability emerges when a multilingual model is trained on certain translation directions; the model can then translate directly in unseen directions. Alternatively, zero-shot translation can be accomplished by pivoting through a third language (e.g., English). In our work, we observe that both direct and pivot translations are noisy and yield unsatisfactory translation quality. We propose EBBS, an ensemble method with a novel bi-level beam search algorithm, in which each ensemble component explores its own predictions step by step at the lower level, while the components are synchronized by a "soft voting" mechanism at the upper level. Results on two popular multilingual translation datasets show that EBBS consistently outperforms direct and pivot translations as well as existing ensemble techniques. Further, we can distill the ensemble's knowledge back into the multilingual model to improve inference efficiency; notably, our EBBS-based distillation does not sacrifice translation quality and in some cases even improves it.
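To make the two levels described in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes each ensemble component (e.g., a direct or a pivot translation path) is a function mapping a prefix to next-token probabilities, that the lower level keeps each component's top candidates, and that "soft voting" averages the candidates' scores across components. The names `bi_level_beam_search`, `direct`, and `pivot`, the scoring rule, and the handling of finished hypotheses are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
# Hypothetical interface: a component maps a prefix to next-token probabilities.
Component = Callable[[Prefix], Dict[str, float]]
EOS = "</s>"


def bi_level_beam_search(
    components: List[Component], beam_size: int = 2, max_len: int = 10
) -> List[Tuple[Prefix, float]]:
    # Upper-level beam shared by all components: (prefix, voted score).
    upper: List[Tuple[Prefix, float]] = [((), 1.0)]
    for _ in range(max_len):
        # Stop once every hypothesis in the shared beam has ended.
        if all(p and p[-1] == EOS for p, _ in upper):
            break
        votes: Dict[Prefix, float] = defaultdict(float)
        for component in components:
            # Lower level: this component expands the shared beam on its own.
            candidates = []
            for prefix, score in upper:
                if prefix and prefix[-1] == EOS:
                    candidates.append((prefix, score))  # carry finished hypotheses
                    continue
                for token, prob in component(prefix).items():
                    candidates.append((prefix + (token,), score * prob))
            candidates.sort(key=lambda c: c[1], reverse=True)
            # Soft voting (assumed form): each component adds the score of its
            # own top candidates; candidates it pruned receive no mass from it.
            for prefix, score in candidates[:beam_size]:
                votes[prefix] += score / len(components)
        # Upper level: keep the candidates with the highest voted score.
        upper = sorted(votes.items(), key=lambda c: c[1], reverse=True)[:beam_size]
    return upper


# Toy components standing in for a direct and a pivot translation path.
def direct(prefix: Prefix) -> Dict[str, float]:
    return {"a": 0.6, "b": 0.4} if not prefix else {EOS: 1.0}


def pivot(prefix: Prefix) -> Dict[str, float]:
    return {"a": 0.3, "b": 0.7} if not prefix else {EOS: 1.0}


best_prefix, best_score = bi_level_beam_search([direct, pivot], beam_size=2)[0]
print(best_prefix, round(best_score, 2))
```

In this toy run the two components disagree on the first token, and the averaged vote favors "b" (0.55 versus 0.45 for "a"), so the shared beam ranks the hypothesis starting with "b" first. A real system would work with log-probabilities and neural model outputs rather than hand-written distributions.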
