Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction

Yaojie Lu, Hongyu Lin, Jin Xu, Xianpei Han, Jialong Tang, Annan Li, Le Sun, Meng Liao, Shaoyi Chen

TL;DR

This work reframes event extraction as end-to-end sequence-to-structure generation. It introduces Text2Event, which linearizes event structures and generates them with a transformer encoder-decoder, using trie-based constrained decoding at inference time and curriculum learning during training. The model learns from coarse annotations that pair each sentence with its event records, and it transfers knowledge across event types, achieving competitive performance on the ACE and ERE benchmarks in both supervised and transfer learning settings. Because only record-level annotations are needed, the approach lowers annotation requirements, and the authors suggest it may extend to broader structure prediction tasks.

Abstract

Event extraction is challenging due to the complex structure of event records and the semantic gap between text and event. Traditional methods usually extract event records by decomposing the complex structure prediction task into multiple subtasks. In this paper, we propose Text2Event, a sequence-to-structure generation paradigm that can directly extract events from the text in an end-to-end manner. Specifically, we design a sequence-to-structure network for unified event extraction, a constrained decoding algorithm for event knowledge injection during inference, and a curriculum learning algorithm for efficient model learning. Experimental results show that, by uniformly modeling all tasks in a single model and universally predicting different labels, our method can achieve competitive performance using only record-level annotations in both supervised learning and transfer learning settings.
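To make the idea of sequence-to-structure generation concrete, the sketch below linearizes a list of event records into a single bracketed target string that a sequence-to-sequence model could be trained to emit. The bracket layout, the `EventRecord` class, the `linearize` function, and the argument spans and role names in the example are all assumptions chosen for illustration; only the Transport event, its trigger "returned", and the Arrest-Jail event type come from the figure captions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventRecord:
    """Hypothetical container for one event record."""
    event_type: str                      # e.g. "Transport"
    trigger: str                         # trigger span copied from the text
    arguments: List[Tuple[str, str]] = field(default_factory=list)  # (role, span)


def linearize(records: List[EventRecord]) -> str:
    """Turn event records into one bracketed string (assumed format)."""
    parts = []
    for rec in records:
        inner = f"{rec.event_type} {rec.trigger}"
        # Each argument becomes a nested "(Role span)" sub-expression.
        args = " ".join(f"({role} {span})" for role, span in rec.arguments)
        if args:
            inner += " " + args
        parts.append(f"({inner})")
    # The outermost pair of brackets wraps all events of the sentence.
    return "(" + " ".join(parts) + ")"


# Illustrative records; argument spans and roles are made up for the example.
records = [
    EventRecord("Transport", "returned",
                [("Artifact", "The man"), ("Destination", "Los Angeles")]),
    EventRecord("Arrest-Jail", "capture", [("Person", "The man")]),
]
print(linearize(records))
```

Under this assumed format, a sentence with no events maps to `()`, so the model always emits a well-formed structure.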

Paper Structure

This paper contains 18 sections, 4 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: The framework of Text2Event. Here, Text2Event takes raw text as input and generates a Transport event and an Arrest-Jail event.
  • Figure 2: Examples of three event representations. The red solid line indicates the event-role relation; the blue dotted line indicates the label-span relation, where the head is a label and the tail is a text span. For example, "Transport-returned" is a label-span relation edge whose head is "Transport" and whose tail is "returned".
  • Figure 3: The prefix tree (trie) of the constrained decoding algorithm for controllable structure generation. ${\color{red}\mathcal{T}}$ and ${\color{red}\mathcal{R}}$ denote the label names of event types and argument roles, respectively. ${\color{blue}\mathcal{S}}$ denotes a text span in the raw text, which is either the event trigger or an argument mention of the extracted event.
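
The constrained decoding described in the Figure 3 caption can be approximated with a function that, given the tokens generated so far, returns the set of tokens allowed next. The sketch below is a simplification and not the authors' implementation: instead of an explicit trie over subword tokens, it uses a small state machine over whole-word tokens, and it assumes a bracketed output format of the form `((T S (R S) ...) ...)`. The names `valid_next_tokens`, `schema`, and `END`, as well as the example schema and sentence, are hypothetical.

```python
from typing import Dict, List, Set

END = "<eos>"  # hypothetical end-of-sequence token


def valid_next_tokens(prefix: List[str],
                      schema: Dict[str, Set[str]],
                      source_tokens: List[str]) -> Set[str]:
    """Return the tokens allowed after `prefix`.

    schema maps each event type (T) to its allowed argument roles (R);
    spans (S) may only use tokens that occur in the source sentence.
    """
    spans = set(source_tokens)
    depth, event_type, last = 0, None, "start"
    for tok in prefix:
        if tok == "(":
            depth, last = depth + 1, "open"
        elif tok == ")":
            depth, last = depth - 1, "close"
        elif last == "open" and depth == 2:
            event_type, last = tok, "label"   # event type T
        elif last == "open" and depth == 3:
            last = "label"                    # argument role R
        else:
            last = "span"                     # token of a text span S

    if last == "start":
        return {"("}                          # open the outermost bracket
    if depth == 0:
        return {END}                          # structure is complete
    if last == "open":
        if depth == 1:
            return {"(", ")"}                 # first event, or empty output
        if depth == 2:
            return set(schema)                # any event type
        return set(schema.get(event_type, set()))  # roles of this event type
    if last == "label":
        return spans                          # a label must be followed by a span
    if last == "span":
        # Extend the span, close it, or (inside an event) open an argument.
        return spans | ({"(", ")"} if depth == 2 else {")"})
    return {"(", ")"}                         # after ")": open a sibling or close


schema = {"Transport": {"Artifact", "Destination"}, "Arrest-Jail": {"Person"}}
source = "The man returned to Los Angeles".split()
print(valid_next_tokens(["(", "(", "Transport", "returned", "("], schema, source))
```

During decoding, the model's next-token scores would be masked so that only tokens in the returned set can be selected, which is how event schema knowledge is injected at inference time.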