Pluggable Neural Machine Translation Models via Memory-augmented Adapters

Yuzhuang Xu, Shuo Wang, Peng Li, Xuebo Liu, Xiaolong Wang, Weidong Liu, Yang Liu

TL;DR

This work introduces memory-augmented adapters to make pretrained NMT models pluggable and controllable by user-provided text samples, addressing data scarcity and retraining costs. A multi-granular memory stores encoder and decoder representations from phrase-level translations, and a novel memory-augmented adapter with gated fusion integrates retrieved memory into both self- and cross-attention in the decoder. Training uses memory dropout and an agreement term to prevent overreliance on memory, and the approach can be combined with kNN decoding for further gains. Experiments on style and domain customization show consistent improvements over strong baselines in automatic and human evaluations, with robust performance across data scales and reasonable inference-time overhead. The method offers a flexible, scalable path toward personalized NMT without full model fine-tuning, with potential extensions to other sequence generation tasks.
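
The retrieval and gated-fusion idea mentioned above can be illustrated with a minimal, pure-Python sketch. It is not the paper's implementation: the dot-product scoring, the softmax over the top entries, the scalar sigmoid gate, and the names `retrieve`, `gated_fusion`, and `gate_weights` are all assumptions made for illustration, and the actual adapter operates inside the decoder's self- and cross-attention.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def retrieve(query, memory, top_k=2):
    """Return a weighted average of the values of the top_k best-matching keys.

    `memory` is a list of (key, value) vector pairs; scoring by dot product
    is an assumption of this sketch.
    """
    scored = sorted(memory, key=lambda kv: dot(query, kv[0]), reverse=True)[:top_k]
    weights = softmax([dot(query, key) for key, _ in scored])
    dim = len(scored[0][1])
    return [sum(w * value[i] for w, (_, value) in zip(weights, scored))
            for i in range(dim)]

def gated_fusion(hidden, retrieved, gate_weights, gate_bias=0.0):
    """Mix the model state and the retrieved vector with a scalar gate g in (0, 1).

    The gate is a sigmoid over the concatenation [hidden; retrieved];
    this parameterization is hypothetical.
    """
    z = dot(gate_weights, hidden + retrieved) + gate_bias
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * r + (1.0 - g) * h for h, r in zip(hidden, retrieved)]
    return fused, g

# Toy usage: two memory entries, one query, an untrained (all-zero) gate.
memory = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
retrieved = retrieve([1.0, 0.0], memory, top_k=1)
fused, gate = gated_fusion([0.0, 2.0], retrieved, gate_weights=[0.0] * 4)
```

With all-zero gate weights the gate is 0.5, so the fused vector is the plain average of the model state and the retrieved vector. A trained gate would instead learn when to trust the memory.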

Abstract

Although neural machine translation (NMT) models perform well in the general domain, it remains challenging to control their generation behavior to satisfy the requirements of different users. Given the expensive training cost and the data scarcity challenge of learning a new model from scratch for each user requirement, we propose a memory-augmented adapter to steer pretrained NMT models in a pluggable manner. Specifically, we construct a multi-granular memory based on the user-provided text samples and propose a new adapter architecture to combine the model representations and the retrieved results. We also propose a training strategy using memory dropout to reduce spurious dependencies between the NMT model and the memory. We validate our approach on both style- and domain-specific experiments, and the results indicate that our method can outperform several representative pluggable baselines.
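
As a rough illustration of the memory dropout idea, the sketch below randomly hides memory entries during training so that a model cannot count on any particular entry being retrievable. The abstract does not specify how the dropout is applied, so dropping whole entries independently with probability `drop_prob` is an assumption, as are the names `memory_dropout`, `drop_prob`, and `rng`.

```python
import random

def memory_dropout(memory, drop_prob, rng, training=True):
    """Return a copy of `memory` with entries dropped independently.

    Each entry is removed with probability `drop_prob` during training.
    At inference time (training=False) the full memory is returned.
    Entry-level dropout is an assumption of this sketch.
    """
    if not training or drop_prob <= 0.0:
        return list(memory)
    return [entry for entry in memory if rng.random() >= drop_prob]

# Toy usage with a seeded generator for reproducibility.
rng = random.Random(0)
memory = [("key_a", "value_a"), ("key_b", "value_b"), ("key_c", "value_c")]
train_view = memory_dropout(memory, drop_prob=0.5, rng=rng)
infer_view = memory_dropout(memory, drop_prob=0.5, rng=rng, training=False)
```

The training-time view is always a subset of the full memory, and it may be empty, so a caller would need to fall back to the model's own representation when nothing can be retrieved.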

Paper Structure (54 sections, 10 equations, 7 figures, 8 tables)

Figures (7)

  • Figure 1: A frozen and pluggable NMT model using memory-augmented plugins. For each user group with special requirements, we can develop a plugin for them without affecting other users.
  • Figure 2: Constructing and integrating memories. (a) We leverage parse trees to obtain multi-granular phrases. Each monolingual phrase is then translated by NMT models. (b) For each phrase pair, we perform a forward computation in the teacher-forcing manner and record some intermediate representations into the memory. (c) Illustration of adapter integration. The adapter retrieves and leverages the memories.
  • Figure 3: Memory-augmented adapter architecture.
  • Figure 4: Performance of style customization at different data scales. "Ours" does not use $k$NN decoding.
  • Figure 5: Inference time reported on the IT domain. "Ours" is not combined with $k$NN decoding.
  • ...and 2 more figures