Asking and Answering Questions to Extract Event-Argument Structures
Md Nayem Uddin, Enfa Rose George, Eduardo Blanco, Steven Corman
TL;DR
This work reframes document-level event-argument extraction as a question-answering task, introducing two question-generation paradigms (template-based and transformer-based) and novel data augmentation strategies that target inter-sentential arguments. By leveraging transfer learning from existing corpora and a RoBERTa-based QA reader, the approach achieves competitive results on RAMS, surpassing the prior state of the art when using transformer-generated questions and augmented data. Zero- and few-shot GPT-3 experiments show that the supervised QA approach remains superior, while analyses reveal the method's strength in inter-sentential argument extraction and its vulnerability to annotation and coreference errors. Overall, the study presents a framework for extracting event-argument structures across sentences that transfers between corpora without corpus-specific modifications.
Abstract
This paper presents a question-answering approach to extract document-level event-argument structures. We automatically ask and answer questions for each argument type an event may have. Questions are generated using manually defined templates and generative transformers. Template-based questions are generated using predefined role-specific wh-words and event triggers from the context document. Transformer-based questions are generated using large language models trained to formulate questions based on a passage and the expected answer. Additionally, we develop novel data augmentation strategies specialized in inter-sentential event-argument relations. We use a simple span-swapping technique, coreference resolution, and large language models to augment the training instances. Our approach enables transfer learning without any corpus-specific modifications and yields competitive results on the RAMS dataset. It outperforms previous work, and it is especially beneficial for extracting arguments that appear in a different sentence than the event trigger. We also present detailed quantitative and qualitative analyses shedding light on the most common errors made by our best model.
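
To make the template-based paradigm concrete, the following is a minimal sketch of how a question could be assembled from a role-specific wh-word and an event trigger. The role names, the wh-word mapping, the question wording, and the function names are all illustrative assumptions; the abstract does not specify the actual templates.

```python
from typing import Dict, List, Tuple

# Hypothetical role-to-wh-word mapping (an assumption for illustration;
# the templates actually used in the paper may differ).
ROLE_WH_WORDS: Dict[str, str] = {
    "victim": "Who",
    "attacker": "Who",
    "place": "Where",
    "instrument": "What",
}


def template_question(role: str, trigger: str) -> str:
    """Build one question from a role-specific wh-word and the event trigger."""
    # Fall back to "What" for roles missing from the mapping.
    wh_word = ROLE_WH_WORDS.get(role, "What")
    return f"{wh_word} is the {role} of the {trigger} event?"


def questions_for_event(trigger: str, roles: List[str]) -> List[Tuple[str, str]]:
    """Ask one question per argument role the event may have."""
    return [(role, template_question(role, trigger)) for role in roles]


if __name__ == "__main__":
    for role, question in questions_for_event("attack", ["attacker", "place"]):
        print(role, "->", question)
```

Each generated question would then be paired with the context document and passed to an extractive QA reader, which returns the answer span (the argument) or no answer.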
