Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs

Fatemeh Shiri, Van Nguyen, Farhad Moghimifar, John Yoo, Gholamreza Haffari, Yuan-Fang Li

TL;DR

The paper tackles hallucination in LLM-based event extraction (EE) by decomposing the task into Event Detection (ED) and Event Argument Extraction (EAE) and by enriching prompts with schema-aware granular instructions and dynamic Retrieval-Augmented Examples (RAE). A two-stage pipeline retrieves the top-$K$ most similar training instances via embeddings (using FAISS with an IndexFlatL2 index) and uses these retrieved exemplars to augment the prompts for both ED and EAE. Evaluations on ACE05-EN, WikiEvents, and a new MaritimeEvent benchmark show consistent improvements over baselines, with notable gains in Trig-C and Arg-C F1 scores, and demonstrate the value of decomposition and retrieval augmentation, especially in low-resource or domain-adaptation scenarios. The findings highlight the practical impact for automatic knowledge-graph construction and decision-support systems, where reliable, scalable EE from large text corpora is crucial.
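The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the training instances, the bag-of-words `embed` function, and the prompt wording are all hypothetical stand-ins, and the exhaustive squared-L2 search in `retrieve_top_k` only mimics in pure Python the kind of exact search that a FAISS IndexFlatL2 index performs over real sentence embeddings.

```python
from collections import Counter

# Hypothetical labelled training instances (illustrative, not from any benchmark).
TRAIN = [
    ("The ship collided with a fishing boat near the port", "Collision"),
    ("Troops attacked the village at dawn", "Attack"),
    ("The company hired a new chief executive", "Start-Position"),
]


def build_vocab(texts):
    """Sorted vocabulary of lowercased, whitespace-separated tokens."""
    return sorted({w for t in texts for w in t.lower().split()})


def embed(text, vocab):
    """Toy bag-of-words vector; a stand-in for a real sentence encoder."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_top_k(query_vec, train_vecs, k):
    """Exhaustive search: indices of the k training vectors nearest to the query."""
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: squared_l2(query_vec, train_vecs[i]),
    )
    return ranked[:k]


def build_prompt(instructions, examples, query):
    """Place the retrieved examples between the instructions and the query."""
    shots = "\n".join(f"Sentence: {s}\nEvent type: {y}" for s, y in examples)
    return f"{instructions}\n\n{shots}\n\nSentence: {query}\nEvent type:"


vocab = build_vocab(s for s, _ in TRAIN)
train_vecs = [embed(s, vocab) for s, _ in TRAIN]
query = "A cargo ship collided with a tanker near the harbour"
top = retrieve_top_k(embed(query, vocab), train_vecs, k=2)
prompt = build_prompt("Identify the event type.", [TRAIN[i] for i in top], query)
print(top)
print(prompt)
```

With these toy vectors the collision sentence is ranked first, so it appears as the leading in-context example in the assembled prompt. Swapping `embed` for a learned encoder and `retrieve_top_k` for a vector index would leave the prompt-assembly step unchanged.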

Abstract

Large Language Models (LLMs) demonstrate significant capabilities in processing natural language data, promising efficient knowledge extraction from diverse textual sources to enhance situational awareness and support decision-making. However, concerns arise due to their susceptibility to hallucination, resulting in contextually inaccurate content. This work focuses on harnessing LLMs for automated Event Extraction, introducing a new method to address hallucination by decomposing the task into Event Detection and Event Argument Extraction. Moreover, the proposed method integrates dynamic schema-aware augmented retrieval examples into prompts tailored for each specific inquiry, thereby extending and adapting advanced prompting techniques such as Retrieval-Augmented Generation. Evaluation findings on prominent event extraction benchmarks and results from a synthesized benchmark illustrate the method's superior performance compared to baseline approaches.
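The decomposition into Event Detection followed by Event Argument Extraction can be pictured as two chained prompts. The sketch below is an assumption-laden illustration rather than the paper's pipeline: `SCHEMA`, the prompt wording, the JSON output format, and the canned `fake_llm` responses are all hypothetical, and the schema filter is only an illustrative guard showing how a schema can constrain what the second stage accepts.

```python
import json

# Hypothetical schema: event type -> allowed argument roles (not from the paper).
SCHEMA = {
    "Attack": ["Attacker", "Target", "Place"],
    "Transport": ["Agent", "Artifact", "Destination"],
}


def detect_events(sentence, llm):
    """Stage 1 (ED): ask for triggers and types, keep only types in the schema."""
    prompt = (
        "Identify event triggers in the sentence.\n"
        f"Allowed event types: {', '.join(SCHEMA)}\n"
        'Answer as a JSON list of {"trigger": ..., "type": ...} objects.\n'
        f"Sentence: {sentence}"
    )
    events = json.loads(llm(prompt))
    return [e for e in events if e.get("type") in SCHEMA]


def extract_arguments(sentence, event, llm):
    """Stage 2 (EAE): ask only for the roles the schema defines for this type."""
    roles = SCHEMA[event["type"]]
    prompt = (
        f"Extract arguments for the {event['type']} event "
        f"triggered by '{event['trigger']}'.\n"
        f"Allowed roles: {', '.join(roles)}\n"
        "Answer as a JSON object mapping role to text span.\n"
        f"Sentence: {sentence}"
    )
    args = json.loads(llm(prompt))
    # Illustrative guard: drop any role the schema does not define.
    return {role: span for role, span in args.items() if role in roles}


def extract_events(sentence, llm):
    """Run ED first, then one EAE call per detected event."""
    results = []
    for event in detect_events(sentence, llm):
        arguments = extract_arguments(sentence, event, llm)
        results.append({**event, "arguments": arguments})
    return results


def fake_llm(prompt):
    """Canned responses standing in for a real model call."""
    if prompt.startswith("Identify event triggers"):
        return json.dumps([
            {"trigger": "attacked", "type": "Attack"},
            {"trigger": "said", "type": "Statement"},  # type outside the schema
        ])
    return json.dumps(
        {"Attacker": "Rebels", "Target": "the convoy", "Weapon": "rockets"}
    )


result = extract_events("Rebels attacked the convoy, officials said.", fake_llm)
print(result)
```

In this toy run the out-of-schema "Statement" event and the out-of-schema "Weapon" role are both discarded, so only schema-consistent output survives. A real system would replace `fake_llm` with an actual model call and would add the retrieved examples to each prompt.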

Paper Structure (15 sections, 6 figures, 3 tables)

Figures (6)

  • Figure 1: An example of enriched prompt for event extraction using GPT-4. GPT-4 is tasked with query instances based on provided instructions, event type definitions, output format, and retrieval-augmented examples in this scenario. These examples are the most similar instances to the query instance retrieved from the existing knowledge base. GPT-4 is expected to produce responses for each query instance without any prior training on the specific task or data. (For simplicity, we only show the first step, Event Detection).
  • Figure 2: An illustration of our end-to-end framework for event extraction, which performs event detection and event argument extraction jointly. The details of the Event Detection Prompt and Argument Extraction Prompt are shown in Figures 3 and 4.
  • Figure 3: An illustration of our granular instructions and the placeholder for retrieval-augmented examples within the ED prompt.
  • Figure 4: An illustration of our granular instructions and the placeholder for retrieval-augmented examples within the EAE prompt.
  • Figure 5: Prompt template for synthesizing maritime reports.
  • ...and 1 more figure