NativE: Multi-modal Knowledge Graph Completion in the Wild
Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Binbin Hu, Ziqi Liu, Wen Zhang, Huajun Chen
TL;DR
NativE tackles the diversity and imbalance challenges of multi-modal knowledge graph completion in the wild by introducing two core components: Relation-guided Dual Adaptive Fusion (ReDAF), which enables adaptive, relation-aware fusion of arbitrary modalities, and Collaborative Modality Adversarial Training (CoMAT), which augments imbalanced modality information via Wasserstein GAN-based adversarial learning. The framework jointly learns multi-modal entity representations and leverages a RotatE-based score to assess triple plausibility, while a theoretical Lipschitz argument supports the adversarial design. A new WildKGC benchmark with five MMKGs demonstrates that NativE achieves state-of-the-art results across diverse datasets and modality configurations, and ablation studies confirm the importance of each module. Additional analyses show CoMAT’s generality across other MMKGC models and provide insights into efficiency and practical deployment, highlighting substantial improvements in real-world MMKGC tasks. Overall, NativE offers a scalable, generalizable approach to MMKGC in the wild, capable of leveraging broad modality spectra and coping with uneven modality distributions.
Abstract
Multi-modal knowledge graph completion (MMKGC) aims to automatically discover unobserved factual knowledge in a given multi-modal knowledge graph by collaboratively modeling the triple structure and the multi-modal information of entities. However, real-world MMKGs are challenging because they are diverse and imbalanced: the modality information can span various types (e.g., image, text, numeric, audio, video), but its distribution among entities is uneven, so some entities are missing some modalities. Existing works usually focus on common modalities such as image and text and neglect the imbalanced distribution of modality information. To address these issues, we propose NativE, a comprehensive framework for MMKGC in the wild. NativE introduces a relation-guided dual adaptive fusion module that enables adaptive fusion of any modalities, and employs a collaborative modality adversarial training framework to augment the imbalanced modality information. We construct a new benchmark called WildKGC with five datasets to evaluate our method. Empirical comparisons with 21 recent baselines confirm the superiority of our method, which consistently achieves state-of-the-art performance across different datasets and various scenarios while remaining efficient and generalizable. Our code and data are released at https://github.com/zjukg/NATIVE
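
The TL;DR mentions that triple plausibility is assessed with a RotatE-based score. As a rough illustration of that family of score functions, here is a minimal NumPy sketch of the standard RotatE form, in which a relation acts as an element-wise rotation in complex space. The function name `rotate_score`, the margin `gamma`, and the toy embeddings are assumptions made for illustration; this is not the code from the NativE repository, and the fused multi-modal entity embeddings are simply stood in for by random complex vectors.

```python
import numpy as np


def rotate_score(head, rel_phase, tail, gamma=0.0):
    """RotatE-style plausibility score (illustrative sketch).

    head, tail : complex arrays of shape (..., d), entity embeddings.
    rel_phase  : real array of shape (..., d), rotation angles of the relation.
    gamma      : margin (hypothetical default).

    Returns gamma minus the sum of complex moduli of (head * r - tail),
    where r = exp(i * rel_phase) has unit modulus in every dimension.
    """
    rotation = np.exp(1j * rel_phase)      # unit-modulus rotation per dimension
    diff = head * rotation - tail          # zero when the triple fits exactly
    return gamma - np.abs(diff).sum(axis=-1)


# Toy example: a tail that is exactly the rotated head scores highest.
rng = np.random.default_rng(0)
h = rng.normal(size=4) + 1j * rng.normal(size=4)
phase = rng.uniform(-np.pi, np.pi, size=4)
t_true = h * np.exp(1j * phase)
t_corrupt = t_true + 1.0                   # perturbed tail, acts as a negative sample

print(rotate_score(h, phase, t_true, gamma=6.0))
print(rotate_score(h, phase, t_corrupt, gamma=6.0))
```

In this sketch a higher score means a more plausible triple, so the exact match should outrank the corrupted tail. How NativE feeds its fused entity representations into the score is not specified in this section and is left out here.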
