Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering
Baiyan Zhang, Qin Chen, Jie Zhou, Jian Jin, Liang He
TL;DR
This paper addresses document-level event causality identification (DECI) by reframing the task as structure-aware causal question answering with a decoder-only large language model. It introduces a multi-task framework that (i) generates the causes and effects of a questioned event through multiple-choice question answering (MCQ), (ii) produces rationales that explain the causal relations, and (iii) linearizes an event causality graph (ECG) to capture multi-hop structure. The approach relies on text clipping and carefully constructed options, combines the Q→A, Q→R, and Q→G tasks under a joint loss, and uses LoRA-based PEFT for efficiency. On two benchmarks, EventStoryLine and Causal-TimeBank, it achieves results competitive with discriminative models and substantially better than other generative methods. The generated rationales and ECG reasoning also make its predictions interpretable, which makes the approach a practical baseline for future generative DECI work.
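To make the graph linearization and the joint loss more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`linearize_ecg`, `joint_loss`), the `cause -> effect` output format, the hop limit, and the loss weights `w_r` and `w_g` are all assumptions made for illustration, since this summary does not specify them.

```python
from collections import deque


def linearize_ecg(edges, start, max_hops=2):
    """Flatten the causal edges within `max_hops` of `start` into one string.

    `edges` is a list of (cause, effect) pairs. The output format
    "cause -> effect ; ..." is an assumption, not the paper's format.
    """
    # Index every edge under both endpoints so that causes and effects
    # of an event are both reachable during traversal.
    incident = {}
    for cause, effect in edges:
        incident.setdefault(cause, []).append((cause, effect))
        incident.setdefault(effect, []).append((cause, effect))

    seen_nodes = {start}
    ordered_edges = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for edge in incident.get(node, []):
            if edge not in ordered_edges:
                ordered_edges.append(edge)
            other = edge[1] if edge[0] == node else edge[0]
            if other not in seen_nodes:
                seen_nodes.add(other)
                queue.append((other, depth + 1))
    return " ; ".join(f"{c} -> {e}" for c, e in ordered_edges)


def joint_loss(loss_answer, loss_rationale, loss_graph, w_r=1.0, w_g=1.0):
    """Weighted sum of the Q->A, Q->R, and Q->G losses (weights are hypothetical)."""
    return loss_answer + w_r * loss_rationale + w_g * loss_graph


edges = [
    ("quake", "collapse"),
    ("collapse", "injuries"),
    ("injuries", "hospitalized"),
]
print(linearize_ecg(edges, "quake", max_hops=2))
```

With a hop limit of 2, the third edge is left out because `injuries` is reached at the limit and is not expanded further.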
Abstract
Document-level Event Causality Identification (DECI) aims to identify causal relations between two events in a document. Recent research tends to use pre-trained language models to generate event causal relations. However, because a document contains multiple events, these methods are prone to sequential generation errors. Moreover, they neglect potential structures such as event coreference and related causal chains. In this paper, we propose a multi-task learning framework that enhances event causality identification with rationale and structure-aware causal question answering. Specifically, we transform the DECI task into multiple-choice question answering and generate the causes and effects of the questioned event with large language models. In addition, we generate rationales that explain why these events are causally related. We also construct an event structure graph that models the multi-hop potential relations needed for causal reasoning about the current event. Experiments on two benchmark datasets show the advantages of our approach over state-of-the-art methods. Finally, we conduct both quantitative and qualitative analyses, which shed light on why each component of our approach contributes to the improvements.
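The multiple-choice reformulation described above can be sketched as follows. This is an illustrative assumption rather than the paper's prompt: the template wording, the "None of the above" option, and the helper names `build_causal_mcq` and `decode_answer` are hypothetical.

```python
import string


def build_causal_mcq(context, target_event, candidates, ask="cause"):
    """Build a multiple-choice prompt asking for causes or effects of one event."""
    if ask not in ("cause", "effect"):
        raise ValueError("ask must be 'cause' or 'effect'")
    if len(candidates) >= len(string.ascii_uppercase):
        raise ValueError("too many candidates for single-letter options")
    # Hypothetical extra option so the model can answer "no causal relation".
    options = list(candidates) + ["None of the above"]
    lines = [
        f"Context: {context}",
        f'Question: Which of the following events are {ask}s of "{target_event}"?',
    ]
    for letter, option in zip(string.ascii_uppercase, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def decode_answer(answer, candidates):
    """Map a generated answer such as "A, C" back to the chosen candidate events."""
    chosen = []
    for token in answer.split(","):
        letter = token.strip().upper()
        if len(letter) != 1 or letter not in string.ascii_uppercase:
            continue  # skip anything that is not a single option letter
        index = string.ascii_uppercase.index(letter)
        if index < len(candidates):  # the trailing "None" option maps to no event
            chosen.append(candidates[index])
    return chosen


candidates = ["quake", "rain", "collapse"]
prompt = build_causal_mcq(
    "A quake struck the city and a building fell.", "injuries", candidates
)
print(prompt)
print(decode_answer("A, C", candidates))
```

Asking about one event at a time yields one question per event, and the selected letters are mapped back to (cause, effect) pairs afterwards.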
