Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective

Minh Le, Tien Ngoc Luu, An Nguyen The, Thanh-Thien Le, Trang Nguyen, Tung Thanh Nguyen, Linh Ngo Van, Thien Huu Nguyen

TL;DR

This paper addresses catastrophic forgetting in continual relation extraction by moving beyond replay-based memory storage to a rehearsal-free framework. It introduces WAVE-CRE, which uses per-task prompt pools to capture within-task variance and a generative latent replay mechanism to consolidate knowledge across tasks. A task predictor and relation classifier enable efficient task-conditional processing, while the prompt pools and sparse gating promote cross-task divergence and scalable specialization. Empirical results on FewRel and TACRED show WAVE-CRE outperforms state-of-the-art prompt-based and rehearsal-free baselines and matches or nearly matches rehearsal-based methods, highlighting its practical impact for privacy-preserving continual learning in relation extraction.

Abstract

To address catastrophic forgetting in Continual Relation Extraction (CRE), many current approaches rely on memory buffers to rehearse previously learned knowledge while acquiring new tasks. Recently, prompt-based methods have emerged as potent alternatives to rehearsal-based strategies, demonstrating strong empirical performance. However, upon analyzing existing prompt-based approaches for CRE, we identified several critical limitations, such as inaccurate prompt selection, inadequate mechanisms for mitigating forgetting in shared parameters, and suboptimal handling of cross-task and within-task variances. To overcome these challenges, we draw inspiration from the relationship between prefix-tuning and mixture of experts, proposing a novel approach that employs a prompt pool for each task, capturing variations within each task while enhancing cross-task variances. Furthermore, we incorporate a generative model to consolidate prior knowledge within shared parameters, eliminating the need for explicit data storage. Extensive experiments validate the efficacy of our approach, demonstrating superior performance over state-of-the-art prompt-based and rehearsal-free methods in continual relation extraction.

Paper Structure

This paper contains 23 sections, 17 equations, 2 figures, 4 tables, 1 algorithm.

Figures (2)

  • Figure 1: Overall framework of WAVE-CRE. To prevent information loss across tasks, we use a task-specific prompt pool $\mathbf{P}_t$ for each task and a representation generator to synthesize past-task information, strengthening the relation classifier's knowledge retention.
  • Figure 2: Data Flow Diagram: Initially, the task predictor predicts the task identity of the input $\boldsymbol{x}$, enabling the selection of the corresponding prompt pool. Subsequently, the input $\boldsymbol{x}$ queries this prompt pool to identify prompts whose corresponding keys are closest to the query $q(\boldsymbol{x})$. The chosen prompt is then prepended to the embedded input $\boldsymbol{x}_e$, creating the prompted input $\boldsymbol{x}_p$. The combined $\boldsymbol{x}_p$ is fed into the BERT Encoder, where the two embeddings corresponding to the positions of the entities $E_1$ and $E_2$ are concatenated. Finally, the resulting concatenated embedding is passed to the relation classifier, which predicts the relation label $y$ of the input $\boldsymbol{x}$.
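
The data flow in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the function names (`select_prompts`, `build_prompted_input`, `entity_representation`), the use of cosine similarity as the "closest key" measure, the `top_k` parameter, and the toy vectors are all assumptions made for illustration. The task predictor, the BERT encoder, and the relation classifier are left out; the task identity is passed in directly and the encoder is treated as an identity map.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_prompts(query, keys, prompts, top_k=1):
    """Return the top_k prompts whose keys are most similar to the query."""
    ranked = sorted(range(len(keys)),
                    key=lambda i: cosine(query, keys[i]),
                    reverse=True)
    return [prompts[i] for i in ranked[:top_k]]


def build_prompted_input(x_e, query, task_id, pools, top_k=1):
    """Prepend prompt tokens from the predicted task's pool to x_e.

    pools maps a task id to (keys, prompts); each prompt is a list of
    token vectors. Returns the prompted input x_p and the prefix length.
    """
    keys, prompts = pools[task_id]
    chosen = select_prompts(query, keys, prompts, top_k)
    prefix = [token for prompt in chosen for token in prompt]
    return prefix + x_e, len(prefix)


def entity_representation(hidden, e1_pos, e2_pos):
    """Concatenate the hidden vectors at the two entity positions."""
    return hidden[e1_pos] + hidden[e2_pos]


# Toy example: two tasks, each with a pool of two one-token prompts.
pools = {
    0: ([[1.0, 0.0], [0.0, 1.0]], [[[0.1, 0.1]], [[0.2, 0.2]]]),
    1: ([[1.0, 1.0], [-1.0, 1.0]], [[[0.3, 0.3]], [[0.4, 0.4]]]),
}
x_e = [[1.0, 1.0], [2.0, 2.0]]   # embedded input; entities at positions 0 and 1
query = [0.9, 0.1]               # stand-in for q(x)
task_id = 0                      # stand-in for the task predictor's output

x_p, offset = build_prompted_input(x_e, query, task_id, pools)
# Entity positions shift by the number of prepended prompt tokens.
rep = entity_representation(x_p, 0 + offset, 1 + offset)
```

In this sketch the query is closest to the first key of task 0's pool, so that pool's first prompt is prepended. Keeping a separate pool per task means the key lookup only competes among prompts of the predicted task, which is one way to read the caption's two-stage selection (task first, then prompt).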