DP-CRE: Continual Relation Extraction via Decoupled Contrastive Learning and Memory Structure Preservation

Mengyi Huang, Meng Xiao, Ludi Wang, Yi Du

TL;DR

DP-CRE tackles catastrophic forgetting in continual relation extraction by decoupling prior information preservation from new knowledge acquisition. It introduces decoupled contrastive learning for new tasks and a change-amount constraint to preserve memory structure, augmented by multi-task balance and memory-guided prototypes. Empirical results on FewRel and TACRED show state-of-the-art accuracy and good memory efficiency, with notable robustness to task imbalance. The approach advances practical CRE by stabilizing representations as relation spaces evolve, enabling scalable continual learning in NLP applications.

Abstract

Continual Relation Extraction (CRE) aims to incrementally learn relation knowledge from a non-stationary stream of data. Since the introduction of new relational tasks can overshadow previously learned information, catastrophic forgetting becomes a significant challenge in this domain. Current replay-based training paradigms prioritize all data uniformly and train memory samples through multiple rounds, which results in overfitting on old tasks and a pronounced bias towards new tasks because the replay set is imbalanced. To address this problem, we introduce the DecouPled CRE (DP-CRE) framework, which decouples the process of prior information preservation from that of new knowledge acquisition. This framework examines alterations in the embedding space as new relation classes emerge, managing the preservation and acquisition of knowledge separately. Extensive experiments show that DP-CRE significantly outperforms other CRE baselines across two datasets.
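
To make the decoupling idea concrete, the following is a minimal sketch in plain Python and not the paper's actual formulation. It assumes that (1) only new-task samples act as anchors in a supervised contrastive loss, with memory samples used only as contrast points, and (2) memory preservation is a mean squared displacement penalty on memory embeddings before and after the update. The function names, the `temperature` and `weight` parameters, and the exact form of both terms are hypothetical and chosen for illustration.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _unit(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(_dot(v, v))
    return [a / norm for a in v]


def new_anchor_contrastive_loss(new_emb, new_labels, mem_emb, temperature=0.1):
    """Supervised contrastive loss in which only NEW samples are anchors.

    Memory embeddings enter the denominator as contrast points but are never
    anchors, so this term does not pull old samples toward each other.
    (Illustrative assumption, not the authors' exact loss.)
    """
    new_u = [_unit(v) for v in new_emb]
    mem_u = [_unit(v) for v in mem_emb]
    total, count = 0.0, 0
    for i, anchor in enumerate(new_u):
        positives = [j for j, y in enumerate(new_labels)
                     if j != i and y == new_labels[i]]
        if not positives:
            continue
        # Contrast set: every other new sample plus all memory samples.
        others = [v for j, v in enumerate(new_u) if j != i] + mem_u
        log_denom = math.log(
            sum(math.exp(_dot(anchor, v) / temperature) for v in others))
        for j in positives:
            total += log_denom - _dot(anchor, new_u[j]) / temperature
            count += 1
    return total / count if count else 0.0


def change_amount_penalty(mem_before, mem_after):
    """Mean squared displacement of memory embeddings across an update."""
    sq = [sum((a - b) ** 2 for a, b in zip(u, v))
          for u, v in zip(mem_before, mem_after)]
    return sum(sq) / len(sq) if sq else 0.0


def replay_objective(new_emb, new_labels, mem_before, mem_after, weight=1.0):
    # New-knowledge term plus a weighted memory-preservation term.
    contrastive = new_anchor_contrastive_loss(new_emb, new_labels, mem_after)
    return contrastive + weight * change_amount_penalty(mem_before, mem_after)
```

In this sketch the two terms are kept in separate functions so that the acquisition term never moves memory samples directly, while the penalty term is the only part that looks at how far memory embeddings have drifted. How the two are actually balanced in DP-CRE is described in the paper itself.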

Paper Structure

This paper contains 26 sections, 12 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: The essential balance in continual relation extraction. Replay is the phase in which model parameters are pulled between learning new data and preserving prior task knowledge.
  • Figure 2: DecouPled Framework of DP-CRE for $T_k$. Green cubes represent prior tasks and yellow cubes represent new tasks. (a) Initial Learning is the routine training on new samples. (b) Replay Learning balances New Knowledge Acquisition and Prior Information Preservation using DecouPled Contrastive Learning and Change Amount Limitation.
  • Figure 3: (a) In the feature space, blue pentagrams indicate old samples while yellow circles represent new ones. (b) Applying contrastive learning to all data would destroy the memory structure information. (c) Retaining old samples unchanged would limit the classification ability of the model. (d) Our approach is to decouple old and new samples so that the structure information is preserved while obtaining a better classification boundary.
  • Figure 4: All ablation study results. We compute the $\Delta$ accuracy (%) between each ablation setting and the intact model, as in the ablation table, for each round.
  • Figure 5: Total training time (s) of DP-CRE and Regular RE. Regular RE is trained using the entire data with the same model architecture.
  • ...and 2 more figures