Improving Event Definition Following For Zero-Shot Event Detection

Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng

TL;DR

This work hypothesizes that a diverse set of event types and definitions is the key for models to learn to follow event definitions, whereas existing event extraction datasets focus on annotating many high-quality examples for only a few event types.

Abstract

Existing approaches to zero-shot event detection usually train models on datasets annotated with known event types and then prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of event types and definitions is the key for models to learn to follow event definitions, whereas existing event extraction datasets focus on annotating many high-quality examples for only a few event types. To verify our hypothesis, we construct an automatically generated Diverse Event Definition (DivED) dataset and conduct comparative studies. Our experiments reveal that a large number of event types (200) and diverse event definitions can significantly boost event extraction performance; on the other hand, performance does not keep improving beyond ten examples per event type. Beyond scaling, we incorporate event ontology information and hard-negative samples during training, which further boosts performance. Based on these findings, we fine-tune a LLaMA-2-7B model on our DivED dataset, yielding performance that surpasses SOTA large language models such as GPT-3.5 across three open benchmarks for zero-shot event detection.

Paper Structure (45 sections, 3 figures, 14 tables)

Figures (3)

  • Figure 1: Zero-shot generative event detection formulation. We demonstrate a generated event type and sample from our DivED dataset. The input prompt includes information about Event Type, Event Definition, Event Ontology and the query passage, and the expected output is a verbalized extracted result.
  • Figure 2: Data generation pipeline for the DivED dataset. The pipeline includes five main steps: (1) Event Type Name Retrieval: retrieve events from XPO overlap (Spaulding et al., 2023); (2) Ontology-Aware Event Definition Curation: generate event type definitions for the event types retrieved in (1); (3) Ontology-Aware Sample Curation: generate samples for the event type names retrieved in (1) and the event definitions from (2); (4) Event Definition Expansion: paraphrase and expand the event definitions from (2); and (5) Ontology Pruning: prune out events with high trigger overlap. Details of our prompt templates can be found in the section on templates for data generation.
  • Figure 3: The scaling of different dataset components. We train the models with different numbers of event types, event definitions per event type, and samples per event type. After training, we report the F1 scores on the DivED Validation and ACE Validation sets. Note that we do not report the DivED Validation score for sample scaling, because we use the Geneva (Parekh et al., 2023) train set rather than the DivED train set to explore sample scaling.
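
The generative formulation described in the Figure 1 caption (a prompt made of Event Type, Event Definition, Event Ontology and the query passage, with a verbalized result as the target) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the `EventQuery` fields, the use of a parent and sibling types to represent the ontology, the prompt layout, the output wording, and the example event are hypothetical and are not taken from the paper's actual templates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventQuery:
    """One zero-shot detection query. All field names are hypothetical."""
    event_type: str
    definition: str
    passage: str
    parent: str = ""                                    # assumed ontology parent
    siblings: List[str] = field(default_factory=list)  # assumed sibling types


def build_prompt(q: EventQuery) -> str:
    """Join the four components named in the Figure 1 caption (assumed layout)."""
    lines = [
        f"Event Type: {q.event_type}",
        f"Event Definition: {q.definition}",
    ]
    ontology = []
    if q.parent:
        ontology.append(f"parent = {q.parent}")
    if q.siblings:
        ontology.append("siblings = " + ", ".join(q.siblings))
    if ontology:  # the ontology line is omitted when nothing is known
        lines.append("Event Ontology: " + "; ".join(ontology))
    lines.append(f"Passage: {q.passage}")
    return "\n".join(lines)


def verbalize(event_type: str, triggers: List[str]) -> str:
    """Render the target output as plain text (assumed wording)."""
    if not triggers:
        return f"No {event_type} event is mentioned."
    return f"Event trigger(s) for {event_type}: " + ", ".join(triggers)


if __name__ == "__main__":
    # Made-up example, not a sample from DivED.
    query = EventQuery(
        event_type="Attack",
        definition="A violent act causing harm.",
        passage="Rebels attacked the town.",
        parent="Conflict",
        siblings=["Demonstrate"],
    )
    print(build_prompt(query))
    print(verbalize("Attack", ["attacked"]))
```

Keeping the prompt builder separate from the verbalizer means the same input format can be reused at inference time, where only the prompt is available and the verbalized string is what the model is expected to generate.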