FusionAdapter for Few-Shot Relation Learning in Multimodal Knowledge Graphs

Ran Liu, Yuan Fang, Xiaoli Li

TL;DR

FusionAdapter tackles few-shot relation learning in multimodal knowledge graphs by introducing per-modality adapters and a diversity-preserving fusion mechanism. The approach maintains modality-specific information while integrating text and image signals through a lightweight adapter design, enabling rapid adaptation to unseen relations within a meta-learning framework. Empirical results on two MMKG benchmarks show strong, consistent improvements over both unimodal and multimodal baselines, driven by the diversity loss and parameter-efficient adapters. This work advances practical multimodal few-shot reasoning in knowledge graphs with notable gains in generalization and robustness.

Abstract

Multimodal Knowledge Graphs (MMKGs) incorporate various modalities, including text and images, to enhance entity and relation representations. Notably, different modalities for the same entity often present complementary and diverse information. However, existing MMKG methods primarily align modalities into a shared space, which tends to overlook the distinct contributions of specific modalities and limits their performance, particularly in low-resource settings. To address this challenge, we propose FusionAdapter for few-shot relation learning (FSRL) in MMKGs. FusionAdapter introduces (1) an adapter module that enables efficient adaptation of each modality to unseen relations and (2) a fusion strategy that integrates multimodal entity representations while preserving diverse modality-specific characteristics. By effectively adapting and fusing information from diverse modalities, FusionAdapter improves generalization to novel relations with minimal supervision. Extensive experiments on two benchmark MMKG datasets demonstrate that FusionAdapter achieves superior performance over state-of-the-art methods.
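The two components named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the adapter architecture, the fusion operator, or the form of the diversity loss, so everything below is an assumption made for illustration: a residual bottleneck adapter per modality, element-wise mean fusion, and mean pairwise cosine similarity as the quantity a diversity term might penalize. The names make_adapter, apply_adapter, fuse, and diversity_penalty are hypothetical and do not come from the paper.

```python
import math
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def make_adapter(dim, bottleneck, rng):
    """Hypothetical bottleneck adapter: down-projection, ReLU, up-projection.

    The up-projection starts at zero, so the adapter initially leaves its
    input unchanged (an assumed, commonly used initialization).
    """
    down = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(bottleneck)]
    up = [[0.0] * bottleneck for _ in range(dim)]
    return {"down": down, "up": up}


def apply_adapter(adapter, x):
    """Return x plus the adapter's correction (residual connection)."""
    hidden = [max(0.0, h) for h in matvec(adapter["down"], x)]
    delta = matvec(adapter["up"], hidden)
    return [xi + di for xi, di in zip(x, delta)]


def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def diversity_penalty(embeddings):
    """Mean pairwise cosine similarity across modalities.

    Assumed stand-in for a diversity loss: lower values mean the modality
    embeddings are less redundant with one another.
    """
    n = len(embeddings)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(embeddings[i], embeddings[j]) for i, j in pairs) / len(pairs)


def fuse(embeddings):
    """Assumed fusion operator: element-wise mean of the modality embeddings."""
    n = len(embeddings)
    return [sum(values) / n for values in zip(*embeddings)]


# Toy text and image embeddings for one entity (illustrative values only).
rng = random.Random(0)
text = [1.0, 0.0, 0.0, 0.0]
image = [0.0, 1.0, 0.0, 0.0]
adapters = {"text": make_adapter(4, 2, rng), "image": make_adapter(4, 2, rng)}

adapted = [
    apply_adapter(adapters["text"], text),
    apply_adapter(adapters["image"], image),
]
fused = fuse(adapted)
penalty = diversity_penalty(adapted)
```

In a few-shot setting, one plausible use of such a design is to update only the small adapter weights on the support triples of a new relation while adding the penalty, scaled by a coefficient, to the task loss. Whether FusionAdapter does exactly this is not stated in the abstract.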

Paper Structure

This paper contains 18 sections, 6 equations, 7 figures, 8 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Diverse information across different modalities for the entity "Wichita Falls" in the FB-IMG dataset.
  • Figure 2: Illustration of key concepts in FusionAdapter. (a) The adapter module for modality fusion. (b) The fusion strategy during meta-testing. For brevity, we omit the meta-training stage, which is similar to meta-testing but with backpropagation of the task loss to update the model parameters.
  • Figure 3: Robustness analysis of the performance under different levels of missing modality data on FB-IMG, reporting MRR.
  • Figure 4: Few-shot size
  • Figure 5: Diversity coefficient
  • ...and 2 more figures