Pluggable Neural Machine Translation Models via Memory-augmented Adapters
Yuzhuang Xu, Shuo Wang, Peng Li, Xuebo Liu, Xiaolong Wang, Weidong Liu, Yang Liu
TL;DR
This work introduces memory-augmented adapters to make pretrained NMT models pluggable and controllable by user-provided text samples, addressing data scarcity and retraining costs. A multi-granular memory stores encoder and decoder representations from phrase-level translations, and a novel memory-augmented adapter with gated fusion integrates retrieved memory into both self- and cross-attention in the decoder. Training uses memory dropout and an agreement term to prevent overreliance on memory, and the approach can be combined with kNN decoding for further gains. Experiments on style and domain customization show consistent improvements over strong baselines in automatic and human evaluations, with robust performance across data scales and reasonable inference-time overhead. The method offers a flexible, scalable path toward personalized NMT without full model fine-tuning, with potential extensions to other sequence generation tasks.
Abstract
Although neural machine translation (NMT) models perform well in the general domain, it remains challenging to control their generation behavior to satisfy the requirements of different users. Given the high training cost and the data scarcity involved in learning a new model from scratch for each user requirement, we propose a memory-augmented adapter to steer pretrained NMT models in a pluggable manner. Specifically, we construct a multi-granular memory from user-provided text samples and propose a new adapter architecture that combines the model representations with the retrieved results. We also propose a training strategy using memory dropout to reduce spurious dependencies between the NMT model and the memory. We validate our approach in both style- and domain-specific experiments, and the results indicate that our method can outperform several representative pluggable baselines.
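
To make the combination of model representations and retrieved results more concrete, the following is a minimal sketch in plain Python and is not the authors' implementation. It assumes a single query vector, single-head scaled dot-product attention, a scalar sigmoid gate computed from the concatenated model and memory outputs, and memory dropout implemented as randomly discarding memory entries during training. All function names and parameters (attend, drop_memory, gated_fusion, gate_w, gate_b, p_drop) are hypothetical and chosen only for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def drop_memory(keys, values, p, rng):
    """Assumed form of memory dropout: drop each memory entry with probability p."""
    kept = [(k, v) for k, v in zip(keys, values) if rng.random() >= p]
    return [k for k, _ in kept], [v for _, v in kept]


def gated_fusion(model_out, memory_out, gate_w, gate_b):
    """Assumed gate: g = sigmoid(w . [model_out; memory_out] + b)."""
    z = dot(gate_w, model_out + memory_out) + gate_b
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * m + (1.0 - g) * h for h, m in zip(model_out, memory_out)]
    return fused, g


def memory_augmented_attention(query, ctx_k, ctx_v, mem_k, mem_v,
                               gate_w, gate_b, p_drop=0.0, rng=None):
    """Attend over the usual context and over the memory, then fuse with a gate."""
    model_out = attend(query, ctx_k, ctx_v)
    if p_drop > 0.0:  # applied at training time only
        mem_k, mem_v = drop_memory(mem_k, mem_v, p_drop, rng or random.Random())
    if not mem_k:  # nothing retrieved or everything dropped: fall back to the model
        return model_out, 0.0
    memory_out = attend(query, mem_k, mem_v)
    return gated_fusion(model_out, memory_out, gate_w, gate_b)
```

In this sketch, a gate value near 0 leaves the pretrained model's output essentially unchanged, while a value near 1 lets the retrieved memory dominate. Dropping memory entries during training forces the model to produce reasonable outputs even when retrieval is missing or unhelpful, which is the intuition behind reducing spurious dependencies on the memory.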
