Learning to Adapt to Low-Resource Paraphrase Generation

Zhigen Li, Yanmeng Wang, Rizhao Fan, Ye Wang, Jianfeng Li, Shaojun Wang

TL;DR

This work tackles domain shift and data scarcity in paraphrase generation by introducing LAPA, a three-stage framework that combines unsupervised pre-training, adapter-based meta-learning on a source domain, and target-domain fine-tuning. The backbone is a BART-large model augmented with adapters; training updates are confined to the adapters and normalization layers and are guided by a MAML-style meta-learning objective. Empirical results show state-of-the-art performance across supervised, unsupervised, and low-resource settings on three benchmarks, and competitive results with only about 2% of the parameters trainable and 1% of the target-task labels. The approach demonstrates effective cross-domain adaptation with minimal supervision, which is a practical benefit when deploying paraphrase systems in data-constrained environments.

Abstract

Paraphrase generation is a long-standing NLP task that has achieved great success with the aid of large corpora. However, transferring a paraphrasing model to another domain runs into the problem of domain shift, especially when data is sparse. At the same time, large pre-trained language models (PLMs) tend to overfit when trained on scarce labeled data. To mitigate these two issues, we propose LAPA, an effective adapter for PLMs optimized by meta-learning. LAPA is trained in three stages on three types of related resources: (1) pre-training PLMs on unsupervised corpora, (2) inserting an adapter layer and meta-training on source-domain labeled data, and (3) fine-tuning adapters on a small amount of target-domain labeled data. This method enables paraphrase generation models to first learn basic language knowledge, then learn the paraphrasing task itself, and finally adapt to the target task. Our experimental results demonstrate that LAPA achieves state-of-the-art performance in supervised, unsupervised, and low-resource settings on three benchmark datasets. With only 2% of the parameters trainable and 1% of the target task's labeled data, our approach achieves performance competitive with previous work.
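
The three-stage recipe can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the "backbone" is a single frozen scalar standing in for the pre-trained PLM, the "adapter" is one trainable scalar added to it, each "domain" is a toy regression task, and the outer update uses the first-order approximation of MAML rather than full second-order MAML. All names (`W_FROZEN`, `meta_train`, `fine_tune`, and so on) are hypothetical, and the normalization-layer updates mentioned in the TL;DR are omitted.

```python
import random

W_FROZEN = 1.0  # stand-in for the frozen pre-trained backbone (stage 1 output)


def loss_and_grad(adapter, batch):
    """Mean squared error of (W_FROZEN + adapter) * x, and its gradient w.r.t. the adapter only."""
    n = len(batch)
    residuals = [(W_FROZEN + adapter) * x - y for x, y in batch]
    loss = sum(r * r for r in residuals) / n
    grad = sum(2.0 * r * x for r, (x, _) in zip(residuals, batch)) / n
    return loss, grad


def sample_batch(slope, rng, size=8):
    """Toy 'domain': targets follow y = slope * x."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    return [(x, slope * x) for x in xs]


def meta_train(task_slopes, steps=200, inner_lr=0.1, outer_lr=0.05, seed=0):
    """Stage 2 (assumed form): first-order MAML over source tasks, updating the adapter only."""
    rng = random.Random(seed)
    adapter = 0.0
    for _ in range(steps):
        slope = rng.choice(task_slopes)
        support = sample_batch(slope, rng)
        query = sample_batch(slope, rng)
        # Inner loop: one gradient step on the support set.
        _, g_support = loss_and_grad(adapter, support)
        adapted = adapter - inner_lr * g_support
        # Outer loop: query-set gradient evaluated at the adapted value
        # (first-order approximation, second-order terms are dropped).
        _, g_query = loss_and_grad(adapted, query)
        adapter -= outer_lr * g_query
    return adapter


def fine_tune(adapter, batch, steps=5, lr=0.1):
    """Stage 3: a few gradient steps on a small target-domain batch."""
    for _ in range(steps):
        _, grad = loss_and_grad(adapter, batch)
        adapter -= lr * grad
    return adapter


# Compare a meta-learned initialization against a zero initialization
# after the same small amount of target-domain fine-tuning.
rng = random.Random(1)
target_batch = sample_batch(2.2, rng, size=4)  # scarce target data
meta_init = meta_train([1.5, 2.0, 2.5])        # source-domain tasks
loss_meta, _ = loss_and_grad(fine_tune(meta_init, target_batch), target_batch)
loss_zero, _ = loss_and_grad(fine_tune(0.0, target_batch), target_batch)
print(f"meta-initialized: {loss_meta:.4f}  zero-initialized: {loss_zero:.4f}")
```

In this toy setting the backbone value never changes, so every update touches only the adapter, which mirrors the idea of confining training to a small set of parameters. The comparison at the end shows why a meta-learned starting point helps when only a handful of target examples and gradient steps are available.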

Paper Structure

This paper contains 20 sections, 3 figures, 4 tables, and 1 algorithm.

Figures (3)

  • Figure 1: Three training stages of the proposed learning paradigm. Gray represents untrainable parameters, and other bright colors represent parameters that have been trained in different stages.
  • Figure 2: Experimental results for different target data sizes on Quora in the low-resource setting.
  • Figure 3: Experimental results for different source corpora, with Quora as the target task, in the low-resource setting.