Zero-Shot Cross-Lingual Document-Level Event Causality Identification with Heterogeneous Graph Contrastive Transfer Learning
Zhitao He, Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Zhiqiang Zhang, Mengshu Sun, Jun Zhao
TL;DR
This work tackles zero-shot cross-lingual document-level event causality identification (ECI), with a focus on low-resource languages. It introduces GIMC, a framework that combines a heterogeneous graph interaction network, which models long-distance dependencies between events scattered across a document, with a multi-granularity contrastive transfer learning module that aligns causal representations across languages. In experiments on MECI, GIMC outperforms the previous state-of-the-art model by 9.4% and 8.2% in average F1 score in the monolingual and multilingual scenarios, respectively. In the multilingual scenario, the zero-shot framework also exceeds GPT-3.5 with few-shot learning by 24.3% in overall performance. The results suggest that a rich graph structure over events, coupled with cross-lingual alignment of causal representations, supports cross-lingual transfer for document-level reasoning tasks.
Abstract
Event Causality Identification (ECI) refers to the detection of causal relations between events in text. However, most existing studies focus on sentence-level ECI in high-resource languages, leaving the more challenging document-level ECI (DECI) in low-resource languages under-explored. In this paper, we propose a Heterogeneous Graph Interaction Model with Multi-granularity Contrastive Transfer Learning (GIMC) for zero-shot cross-lingual document-level ECI. Specifically, we introduce a heterogeneous graph interaction network to model the long-distance dependencies between events that are scattered over a document. Then, to improve the cross-lingual transferability of causal knowledge learned from the source language, we propose a multi-granularity contrastive transfer learning module to align the causal representations across languages. Extensive experiments show that our framework outperforms the previous state-of-the-art model by 9.4% and 8.2% in average F1 score in the monolingual and multilingual scenarios, respectively. Notably, in the multilingual scenario, our zero-shot framework even exceeds GPT-3.5 with few-shot learning by 24.3% in overall performance.
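
To make the idea of contrastive alignment across languages concrete, the following is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is not the GIMC objective: the abstract does not specify the loss, the granularities, or how positive pairs are built, so the function names (`cosine`, `info_nce`), the temperature value, and the toy vectors are all assumptions made for illustration. The sketch covers a single granularity only, treating `source[i]` and `target[i]` as a positive pair and every other target vector as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(source, target, temperature=0.1):
    """Mean InfoNCE loss over paired representations (illustrative, hypothetical helper).

    source[i] and target[i] form a positive pair (for example, the same
    causal pattern encoded in two languages); every target[j] with j != i
    serves as a negative for source[i].
    """
    losses = []
    for i, anchor in enumerate(source):
        logits = [cosine(anchor, t) / temperature for t in target]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive among all targets.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy 2-d "representations" (made up, not from the paper).
src = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # positives close to their anchors
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # positives swapped

loss_aligned = info_nce(src, aligned)
loss_misaligned = info_nce(src, misaligned)
print(loss_aligned < loss_misaligned)
```

Minimizing a loss of this shape pulls paired source-language and target-language representations together and pushes unpaired ones apart, which is the general mechanism that contrastive transfer methods rely on. A multi-granularity variant would presumably apply such a term at more than one level of representation, but those details are not given in the abstract.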
