Elastic Weight Consolidation for Knowledge Graph Continual Learning: An Empirical Evaluation

Gaganpreet Jhajj, Fuhua Lin

TL;DR

This study evaluates Elastic Weight Consolidation (EWC) for continual learning in knowledge graph link prediction using TransE on FB15k-237. It shows that EWC substantially reduces catastrophic forgetting across sequential tasks, with the strongest setting (λ=10) lowering forgetting from 12.62% to 6.85% and improving final MRR. The results also reveal that how tasks are constructed (relation-based vs random partitioning) greatly affects forgetting, underscoring the need for careful evaluation protocols. The work highlights EWC as an effective regularization-based approach for KG continual learning, while calling for broader generalization studies across datasets, embeddings, and task sequences.

Abstract

Knowledge graphs (KGs) require continual updates as new information emerges, but neural embedding models suffer from catastrophic forgetting when learning new tasks sequentially. We evaluate Elastic Weight Consolidation (EWC), a regularization-based continual learning method, on KG link prediction using TransE embeddings on FB15k-237. Across multiple experiments with five random seeds, we find that EWC reduces catastrophic forgetting from 12.62% to 6.85%, a 45.7% reduction compared to naive sequential training. We observe that the task partitioning strategy affects the magnitude of forgetting: relation-based partitioning (grouping triples by relation type) exhibits 9.8 percentage points higher forgetting than randomly partitioned tasks (12.62% vs 2.81%), suggesting that task construction influences evaluation outcomes. While focused on a single embedding model and dataset, our results demonstrate that EWC effectively mitigates catastrophic forgetting in KG continual learning and highlight the importance of evaluation protocol design.
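The two headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below assumes the 45.7% figure is a relative reduction computed against the naive baseline and that the 9.8 figure is an absolute gap in percentage points. The helper name `relative_reduction` is illustrative and not from the paper.

```python
# Forgetting values as reported in the abstract (percent).
naive_forgetting = 12.62    # naive sequential training, relation-based tasks
ewc_forgetting = 6.85       # EWC, relation-based tasks
random_forgetting = 2.81    # naive sequential training, random partitioning


def relative_reduction(baseline: float, treated: float) -> float:
    """Relative reduction of `treated` with respect to `baseline`, in percent."""
    return (baseline - treated) / baseline * 100.0


# Relative reduction in forgetting achieved by EWC over naive training.
reduction = relative_reduction(naive_forgetting, ewc_forgetting)

# Absolute gap between the two partitioning strategies, in percentage points.
gap = naive_forgetting - random_forgetting

print(f"EWC relative reduction: {reduction:.1f}%")   # 45.7%
print(f"Partitioning gap: {gap:.1f} points")         # 9.8 points
```

Note the difference in units: the first number is a ratio of two forgetting values, while the second is a plain difference between them.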

Paper Structure

This paper contains 19 sections, 7 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Catastrophic forgetting on relation-based partitioned tasks. EWC ($\lambda=10$) substantially reduces forgetting compared to naive sequential training and replay-based methods.
  • Figure 2: Effect of task partitioning on forgetting. Relation-based partitioning creates more challenging continual learning scenarios: under naive training, forgetting is 9.8 percentage points higher than with random partitioning.
  • Figure 3: Performance-forgetting trade-off. EWC ($\lambda=10$) achieves the best balance, with low forgetting (6.85%) and competitive final MRR (0.242).
  • Figure 4: Task retention matrix for EWC ($\lambda=10$). The diagonal shows performance immediately after learning each task; the line below the diagonal shows retention after subsequent tasks. Minimal degradation indicates effective continual learning.
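
A retention matrix like the one described for Figure 4 can be reduced to a single forgetting score. The sketch below is a minimal illustration, assuming that row i holds performance measured after training on task i, column j holds the task being evaluated, and forgetting is the relative drop from the diagonal entry to the final row, averaged over all tasks except the last. This is one common definition and may differ from the one used in the paper. The matrix values and the function name `average_forgetting` are hypothetical, not results from the paper.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def average_forgetting(retention: Matrix) -> float:
    """Mean relative drop (percent) from just-learned to final performance.

    retention[i][j] is the score on task j after training on task i.
    Entries above the diagonal (tasks not yet seen) are ignored.
    """
    n_tasks = len(retention)
    drops = []
    # The last task has no later training stage, so it cannot be forgotten yet.
    for j in range(n_tasks - 1):
        just_learned = retention[j][j]        # diagonal entry
        final = retention[n_tasks - 1][j]     # last row, same task
        drops.append((just_learned - final) / just_learned * 100.0)
    return sum(drops) / len(drops)


# Hypothetical MRR values for three sequential tasks (illustration only).
toy_matrix: Matrix = [
    [0.30, None, None],
    [0.27, 0.28, None],
    [0.24, 0.21, 0.26],
]

score = average_forgetting(toy_matrix)
print(f"Average forgetting: {score:.1f}%")
```

With this layout, a matrix whose final row stays close to its diagonal yields a score near zero, which matches the "minimal degradation" reading of the figure.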