MAVEN-Fact: A Large-scale Event Factuality Detection Dataset

Chunyang Li; Hao Peng; Xiaozhi Wang; Yunjia Qi; Lei Hou; Bin Xu; Juanzi Li

TL;DR

MAVEN-Fact is the largest Event Factuality Detection (EFD) dataset, annotating 112,276 events with five factuality classes and supporting evidence. Because it is built on the MAVEN dataset, it also includes event types, arguments, and relations. The paper presents an LLM-then-Human annotation workflow that reduces labeling costs while maintaining high data quality, and the rich annotations support analyses of event elements and hallucination mitigation. Experimental results show that MAVEN-Fact is challenging for both fine-tuned EFD models and LLMs: incorporating event arguments and relations helps fine-tuned models, while prompting strategies only partially improve LLM performance. A preliminary application demonstrates that explicitly injecting factuality information can mitigate event-related hallucinations in LLMs. The dataset and code are released to advance research on and applications of faithful event understanding.

Abstract

The Event Factuality Detection (EFD) task determines the factuality of textual events, i.e., classifying whether an event is a fact, possibility, or impossibility, which is essential for faithfully understanding and utilizing event knowledge. However, due to the lack of high-quality large-scale data, event factuality detection is under-explored in event understanding research, which limits the development of the EFD community. To address this issue and provide faithful event understanding, we introduce MAVEN-Fact, a large-scale and high-quality EFD dataset based on the MAVEN dataset. MAVEN-Fact includes factuality annotations of 112,276 events, making it the largest EFD dataset. Extensive experiments demonstrate that MAVEN-Fact is challenging for both conventional fine-tuned models and large language models (LLMs). Thanks to the comprehensive annotations of event arguments and relations in MAVEN, MAVEN-Fact also supports further analyses: we find that using event arguments and relations helps fine-tuned models in event factuality detection but does not benefit LLMs. Furthermore, we present a preliminary study of an application of event factuality detection and find that it helps mitigate event-related hallucination in LLMs. Our dataset and code can be obtained from https://github.com/lcy2723/MAVEN-FACT.
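As a rough illustration of the label space described above, the following Python sketch encodes five factuality classes and collapses them to the coarse fact / possibility / impossibility view. It is not taken from the released code. Only "Uu" is named in the text here; the other four label strings (CT+, CT-, PS+, PS-) follow the common FactBank-style convention and are an assumption, as are the names Factuality, COARSE_VIEW, and coarse_label, and the choice to treat PS- as a possibility.

```python
from enum import Enum


class Factuality(Enum):
    """Five factuality classes (label strings other than "Uu" are assumed)."""
    CT_POS = "CT+"  # certainly happened
    CT_NEG = "CT-"  # certainly did not happen
    PS_POS = "PS+"  # possibly happened
    PS_NEG = "PS-"  # possibly did not happen
    UU = "Uu"       # cannot be determined from the given context


# Hypothetical mapping onto the coarse view used in the abstract.
COARSE_VIEW = {
    Factuality.CT_POS: "fact",
    Factuality.CT_NEG: "impossibility",
    Factuality.PS_POS: "possibility",
    Factuality.PS_NEG: "possibility",  # assumption: uncertain either way
    Factuality.UU: "undetermined",
}


def coarse_label(label: Factuality) -> str:
    """Collapse a fine-grained class into the coarse view."""
    return COARSE_VIEW[label]


# The triggers from Figure 1, labeled under the assumed scheme:
# "play" is factual; "win" and "celebrate" are only possibilities.
figure1_events = {
    "play": Factuality.CT_POS,
    "win": Factuality.PS_POS,
    "celebrate": Factuality.PS_POS,
}

coarse_figure1 = {t: coarse_label(f) for t, f in figure1_events.items()}
```

Keeping the fine-grained labels as an enum and deriving the coarse view from them means the mapping can be changed in one place if the dataset's actual label names or groupings differ.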

Paper Structure (28 sections, 4 figures, 15 tables)

Introduction
Dataset Construction
Task Formulation
LLM-then-Human Annotation Approach
LLM Annotation
Human Annotation
Data Analysis
Experiment
Experimental Setup
Baselines
Evaluation Setup
Experimental Results
Supporting Evidence Prediction
Analysis on Task Interaction
Mitigating Event-related Hallucinations
...and 13 more sections

Figures (4)

Figure 1: An example of event understanding. The event "play" is factual while the events "win" and "celebrate" are just possibilities considering the word "might".
Figure 2: An illustration of four factuality classes. The fifth class, Uu, denotes that factuality cannot be determined from the given context and is not shown in the figure.
Figure 3: Prompt used in LLM pre-annotation.
Figure 4: Screenshot for the annotation platform. The trigger word "siege" is selected for annotation, highlighted in yellow. Events related to it are highlighted in blue and green based on their relation type.
