Continual Multimodal Knowledge Graph Construction

Xiang Chen; Jintian Zhang; Xiaohan Wang; Ningyu Zhang; Tongtong Wu; Yuxiang Wang; Yongheng Wang; Huajun Chen

TL;DR

This work tackles continual multimodal knowledge graph construction (MKGC), addressing the challenge of catastrophic forgetting as new entities and relations continually emerge. It introduces MSPT, a dual-stream Transformer framework that combines gradient modulation for balanced learning with hand-in-hand multimodal interaction and attention distillation to preserve past knowledge. The authors also establish incremental MKGC benchmarks (IMNER and IMRE) and show MSPT outperforms both multimodal MKGC baselines and traditional continual learning methods, with strong plasticity and robust stability. The study demonstrates that careful management of inter-modal learning dynamics and attention patterns yields superior performance in evolving multimodal knowledge environments, with practical implications for real-world streaming data scenarios.

Abstract

Current Multimodal Knowledge Graph Construction (MKGC) models struggle with the real-world dynamism of continuously emerging entities and relations, often succumbing to catastrophic forgetting, the loss of previously acquired knowledge. This study introduces benchmarks aimed at fostering the development of the continual MKGC domain. We further introduce the MSPT framework, designed to surmount the shortcomings of existing MKGC approaches during multimedia data processing. MSPT harmonizes the retention of learned knowledge (stability) and the integration of new data (plasticity), outperforming current continual learning and multimodal methods. Our results confirm MSPT's superior performance in evolving knowledge environments, showcasing its capacity to balance stability and plasticity.

Paper Structure (30 sections, 15 equations, 6 figures, 3 tables)

This paper contains 30 sections, 15 equations, 6 figures, 3 tables.

Introduction
Related Works
Advancements in MKGC
Multimodal Named Entity Recognition.
Multimodal Relation Extraction.
Continual Knowledge Graph Construction
Preliminaries
Delineation of MKGC Tasks
Class-Incremental Continual Learning
Methodology
Framework Overview
Balanced Multimodal Learning Dynamics
Hand-in-hand Multimodal Interaction via Attention Distillation
Hand-in-hand Multimodal Interaction
Core Attention Distillation
...and 15 more sections

Figures (6)

Figure 1: Results on incremental MRE (IMRE) benchmark. We benchmark MSPT against the Vanilla Training approach, multimodal KGC models such as MEGA and MKGformer, as well as the continual RE method RP-CRE.
Figure 2: Overview of our MSPT framework.
Figure 3: Performance in plasticity on the IMRE Benchmark.
Figure 4: Change of contribution ratio $\gamma^{t}_{n}$ during training.
Figure 5: Analysis on rehearsal size.
...and 1 more figure

Theorems & Definitions (2)

Remark 1
Remark 2
