Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment

Luyao Wang, Pengnian Qi, Xigang Bao, Chunlai Zhou, Biao Qin

TL;DR

This work introduces Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised method that uses mutual information maximization to filter modal-specific noise and to augment modal-invariant commonality.

Abstract

Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs so that the graphs can be integrated. Prior work has focused on improving the interaction and fusion of multi-modal information, but has overlooked the influence of modal-specific noise and the use of labeled and unlabeled data in semi-supervised settings. In this work, we introduce Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised approach. Specifically, to generate holistic entity representations, we first devise several embedding modules and attention mechanisms to extract visual, structural, relational, and attribute features. Unlike prior direct-fusion methods, we then exploit mutual information maximization to filter modal-specific noise and to augment modal-invariant commonality. Next, we combine pseudo-label calibration with momentum-based contrastive learning to make full use of both labeled and unlabeled data, which improves the quality of the pseudo-labels and pulls aligned entities closer together. Finally, extensive experiments on two MMEA datasets demonstrate the effectiveness of PCMEA, which achieves state-of-the-art performance.
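
The abstract does not state which mutual information estimator PCMEA uses. As a hedged illustration only, the sketch below uses InfoNCE, a common contrastive lower bound on mutual information, between two views of the same entities (for example, a visual and a structural embedding). The function name `info_nce`, the temperature value, and the toy vectors are all assumptions made for this example, not details from the paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(xs, ys, temperature=0.1):
    """Mean InfoNCE loss over a batch where xs[i] and ys[i] are a positive pair.

    log(batch size) minus this loss is a lower bound on the mutual
    information between the two views. Hypothetical helper, not the
    paper's loss.
    """
    xs = [normalize(x) for x in xs]
    ys = [normalize(y) for y in ys]
    total = 0.0
    for i, x in enumerate(xs):
        # Cosine similarity of x to every candidate, scaled by temperature.
        logits = [dot(x, y) / temperature for y in ys]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(xs)


# Toy embeddings (made up): two entities, two modalities.
visual = [[1.0, 0.0], [0.0, 1.0]]
structural_matched = [[0.9, 0.1], [0.1, 0.9]]
structural_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = info_nce(visual, structural_matched)
loss_shuffled = info_nce(visual, structural_shuffled)
```

When the two modalities agree on which entity is which, the loss is close to zero; when the pairing is shuffled, the loss is much larger. Minimizing such a loss rewards information shared across modalities rather than information specific to one of them.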

Paper Structure

This paper contains 21 sections, 12 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: An example of multi-modal entity alignment. Ovals represent entities, and diamonds represent attribute values. Dotted lines indicate corresponding relations or attributes of two aligned entities.
  • Figure 2: The overall architecture of PCMEA, which combines heterogeneous multi-modal attention-guided embedding and is trained with an MI-maximization-enhanced alignment loss (consisting of the alignment loss $\mathcal{L}_{AL}$ and the MI maximization loss $\mathcal{L}_{MI}$) and a momentum-based contrastive loss $\mathcal{L}_{CL}$.
  • Figure 3: Study on (a) momentum coefficient, (b) momentum network update span, (c) time of changing training strategy.
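
The momentum coefficient and update span studied in Figure 3 can be illustrated with a minimal sketch of an exponential-moving-average (EMA) momentum network, the usual mechanism in momentum-based contrastive learning. This assumes that "update span" means the number of training steps between momentum-network updates; that reading, the function names, and the stand-in for a gradient step are all hypothetical and not taken from the paper.

```python
def momentum_update(online, momentum, coefficient):
    """EMA step: momentum <- coefficient * momentum + (1 - coefficient) * online."""
    return [coefficient * m + (1.0 - coefficient) * o
            for m, o in zip(momentum, online)]


def train(steps, coefficient=0.99, update_span=2):
    """Toy loop: the online weights change every step, while the momentum
    weights are refreshed only once every `update_span` steps."""
    online = [0.0, 0.0]
    momentum = list(online)
    updates = 0
    for step in range(1, steps + 1):
        # Stand-in for a gradient step on the online network (assumption).
        online = [w + 1.0 for w in online]
        if step % update_span == 0:
            momentum = momentum_update(online, momentum, coefficient)
            updates += 1
    return online, momentum, updates


online, momentum, updates = train(steps=4, coefficient=0.5, update_span=2)
```

A coefficient closer to 1 makes the momentum network change more slowly, and a larger update span refreshes it less often; both control how stable the targets produced by the momentum network are.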