Self-generated Replay Memories for Continual Neural Machine Translation
Michele Resta, Davide Bacciu
TL;DR
This work tackles catastrophic forgetting in continual multilingual neural machine translation with SG-Rep, a replay-based method that uses the model itself to generate synthetic parallel sentences. SG-Rep maintains a fixed-size replay memory of self-generated pseudo-samples, which are filtered and translated to form training data for future experiences, mitigating forgetting without storing real past data. On the IWSLT17 and UNPC datasets, SG-Rep consistently outperforms traditional continual learning baselines and approaches the performance of joint training, and it remains robust to experience order and to token diversity challenges. The method offers a practical, privacy-conscious route to continual multilingual NMT with manageable computational overhead.
Abstract
Modern Neural Machine Translation systems exhibit strong performance in several different languages and are constantly improving. Their ability to learn continuously is, however, still severely limited by catastrophic forgetting. In this work, we leverage a key property of encoder-decoder Transformers, namely their generative ability, to propose a novel approach to continual learning for Neural Machine Translation systems. We show that a model can learn effectively on a stream of experiences comprising different languages by using a replay memory that the model itself populates, acting as a generator of parallel sentences. We empirically demonstrate that our approach can counteract catastrophic forgetting without requiring explicit memorization of training data. Code is available at https://github.com/m-resta/sg-rep
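The replay loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `keep_sample`, `build_replay_memory`, and `mix_with_replay`, the length-based filter, and the equal split of the memory across translation directions are all assumptions made for illustration. The `generate` and `translate` callables stand in for the model sampling a source sentence and translating it.

```python
import random
from typing import Callable, List, Tuple

# (source language, target language, source sentence, target sentence)
Sample = Tuple[str, str, str, str]


def keep_sample(sentence: str, min_tokens: int = 3, max_tokens: int = 50) -> bool:
    """Hypothetical filter: keep sentences whose whitespace token count is in range."""
    n_tokens = len(sentence.split())
    return min_tokens <= n_tokens <= max_tokens


def build_replay_memory(
    generate: Callable[[str], str],
    translate: Callable[[str, str, str], str],
    directions: List[Tuple[str, str]],
    memory_size: int,
    max_attempts_per_slot: int = 10,
) -> List[Sample]:
    """Fill a fixed-size memory with self-generated pseudo-parallel pairs.

    generate(lang) and translate(sentence, src_lang, tgt_lang) are assumed
    wrappers around the current model; they are not part of any real API.
    """
    if not directions:
        return []
    memory: List[Sample] = []
    seen = set()  # drop duplicate generations
    per_direction = memory_size // len(directions)  # assumed equal split
    for src_lang, tgt_lang in directions:
        added, attempts = 0, 0
        budget = per_direction * max_attempts_per_slot
        while added < per_direction and attempts < budget:
            attempts += 1
            source = generate(src_lang)
            if source in seen or not keep_sample(source):
                continue
            seen.add(source)
            target = translate(source, src_lang, tgt_lang)
            memory.append((src_lang, tgt_lang, source, target))
            added += 1
    return memory


def mix_with_replay(new_data: List[Sample], memory: List[Sample], seed: int = 0) -> List[Sample]:
    """Combine the data of the new experience with the replay memory and shuffle."""
    mixed = list(new_data) + list(memory)
    random.Random(seed).shuffle(mixed)
    return mixed
```

Before training on a new experience, one would call `build_replay_memory` with the directions learned so far and then train on the output of `mix_with_replay`. The memory size stays bounded no matter how many experiences have been seen, and no real past sentence is stored.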
