Multi-Source Neural Translation
Barret Zoph, Kevin Knight
TL;DR
The paper addresses translation into English from two source languages at once, using the second source to reduce ambiguity (triangulation) by directly modeling $P(e|f,g)$ in a neural encoder-decoder framework. It compares three ways of combining the two source encodings for the decoder: Basic concatenation, Child-Sum, and Multi-Source Attention. On the WMT 2014 tri-source dataset, it reports gains of up to +4.8 BLEU over a strong single-source baseline, with larger gains when the two source languages are more linguistically distant, which supports explicit multi-source integration. The work also analyzes attention behavior and releases code to support reproducibility and further research.
Abstract
We build a multi-source machine translation model and train it to maximize the probability of a target English string given French and German sources. Using the neural encoder-decoder framework, we explore several combination methods and report gains of up to +4.8 BLEU over a very strong attention-based neural translation model.
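
To make the idea of a combination method concrete, the Basic concatenation approach named in the TL;DR can be sketched roughly as follows. This is a minimal illustration, not the authors' released code: it assumes the two final encoder states are concatenated, passed through a learned linear map, and squashed with tanh. The function name `basic_combine`, the parameters `W` and `b`, and all numeric values are hypothetical.

```python
import math


def basic_combine(h_f, h_g, W, b):
    """Fuse two encoder states into one vector for the decoder.

    Assumed form (a sketch, not taken from the text above):
        h = tanh(W [h_f; h_g] + b)
    h_f, h_g: lists of floats, e.g. final French and German encoder states.
    W: list of rows, each of length len(h_f) + len(h_g).
    b: list of biases, one per row of W.
    """
    concat = list(h_f) + list(h_g)  # [h_f; h_g]
    if len(W) != len(b) or any(len(row) != len(concat) for row in W):
        raise ValueError("W must have len(b) rows of length len(h_f) + len(h_g)")
    return [
        math.tanh(sum(w * x for w, x in zip(row, concat)) + bias)
        for row, bias in zip(W, b)
    ]


# Toy values standing in for real encoder outputs and learned weights.
h_f = [0.5, -1.0]
h_g = [1.0, 0.0]
W = [
    [1.0, 0.0, 0.0, 0.0],  # picks out h_f[0]
    [0.0, 0.0, 1.0, 0.0],  # picks out h_g[0]
]
b = [0.0, 0.0]

h = basic_combine(h_f, h_g, W, b)
```

The fused vector `h` has the same size as a single encoder state, so under this assumption a standard single-source decoder could be initialized from it without further changes.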
