Leveraging Intra-modal and Inter-modal Interaction for Multi-Modal Entity Alignment

Zhiwei Hu; Víctor Gutiérrez-Basulto; Zhiliang Xiang; Ru Li; Jeff Z. Pan

TL;DR

This work tackles multi-modal entity alignment across heterogeneous MMKGs by introducing MIMEA, a four-module framework that explicitly models intra-modal and inter-modal interactions. It combines a multi-modal knowledge embedding stage (via GAT and MLPs) with a probabilistic fusion (Beta distributions), optimal transport-based alignment, and modal-adaptive contrastive learning to produce robust, joint-modal representations. Extensive experiments on FB15K-DB15K and FB15K-YAGO15K demonstrate state-of-the-art performance and robustness to seed availability, with ablations confirming the critical role of structural information and inter-modal dynamics. Overall, MIMEA advances MMKG integration by systematically exploiting multi-granular interactions, achieving strong alignment accuracy while maintaining favorable computational efficiency; future work includes addressing incomplete structural knowledge through KG completion.

Abstract

Multi-modal entity alignment (MMEA) aims to identify equivalent entity pairs across different multi-modal knowledge graphs (MMKGs). Existing approaches focus on how to better encode and aggregate information from different modalities. However, leveraging multi-modal knowledge in entity alignment is not trivial because of modal heterogeneity. In this paper, we propose a Multi-Grained Interaction framework for Multi-Modal Entity Alignment (MIMEA), which realizes multi-granular interaction both within the same modality and between different modalities. MIMEA is composed of four modules: i) a Multi-modal Knowledge Embedding module, which extracts modality-specific representations with multiple individual encoders; ii) a Probability-guided Modal Fusion module, which employs a probability-guided approach to integrate uni-modal representations into joint-modal embeddings, while considering the interaction between uni-modal representations; iii) an Optimal Transport Modal Alignment module, which introduces an optimal transport mechanism to encourage the interaction between uni-modal and joint-modal embeddings; iv) a Modal-adaptive Contrastive Learning module, which, for each modality, distinguishes the embeddings of equivalent entities from those of non-equivalent ones. Extensive experiments on two real-world datasets demonstrate the strong performance of MIMEA compared to the state of the art (SoTA). Datasets and code have been submitted as supplementary materials.
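To make the idea behind module iv) more concrete, the following is a minimal sketch of a contrastive alignment objective. It is not MIMEA's actual loss, which this section does not specify: it assumes a generic InfoNCE-style formulation with cosine similarity, in-batch negatives, and a `temperature` hyperparameter. The function names `cosine` and `contrastive_alignment_loss` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(src, tgt, temperature=0.1):
    """Assumed InfoNCE-style loss, not the paper's exact objective.

    src[i] and tgt[i] are embeddings of a seed-aligned entity pair from
    two MMKGs; every other row in the batch serves as a negative.
    The loss is averaged over both directions (src->tgt and tgt->src).
    """
    n = len(src)
    total = 0.0
    for i in range(n):
        logits_st = [cosine(src[i], tgt[j]) / temperature for j in range(n)]
        logits_ts = [cosine(tgt[i], src[j]) / temperature for j in range(n)]
        for logits in (logits_st, logits_ts):
            # Numerically stable log-sum-exp for the softmax denominator.
            m = max(logits)
            log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
            # Negative log-probability of the true counterpart at index i.
            total += log_denom - logits[i]
    return total / (2 * n)


# Toy embeddings for one modality: two entities per graph.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt_aligned = [[1.0, 0.0], [0.0, 1.0]]   # counterparts match
tgt_swapped = [[0.0, 1.0], [1.0, 0.0]]   # counterparts mismatched

loss_aligned = contrastive_alignment_loss(src, tgt_aligned)
loss_swapped = contrastive_alignment_loss(src, tgt_swapped)
```

In this toy setup the loss is close to zero when counterparts share the same direction and large when they are swapped. Under the same assumptions, a per-modality variant would evaluate this function once for each modality's embeddings (and for the joint-modal embeddings) and combine the results; how MIMEA weights those terms is not stated here.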
