When and Where Did it Happen? An Encoder-Decoder Model to Identify Scenario Context

Enrique Noriega-Atala, Robert Vacareanu, Salena Torres Ashton, Adarsh Pyarelal, Clayton T. Morrison, Mihai Surdeanu

TL;DR

The findings suggest that a relatively small fine-tuned encoder-decoder model performs better than out-of-the-box LLMs and semantic role labeling parsers at accurately predicting the relevant scenario information of a particular entity or event.

Abstract

We introduce a neural architecture fine-tuned for the task of scenario context generation: identifying the relevant location and time of an event or entity mentioned in text. Contextualizing information extraction helps to scope the validity of automated findings when aggregating them into knowledge graphs. Our approach uses a high-quality curated dataset of time and location annotations in a corpus of epidemiology papers to train an encoder-decoder architecture. We also explored the use of data augmentation techniques during training. Our findings suggest that a relatively small fine-tuned encoder-decoder model performs better than out-of-the-box LLMs and semantic role labeling parsers at accurately predicting the relevant scenario information of a particular entity or event.

Paper Structure

This paper contains 15 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Example annotations in our dataset. Predicates highlighted in yellow represent 'events' with scenario context information assigned to them, text highlighted in cyan represents temporal context, and text highlighted in green represents location context. Each arrow connects a context expression to an event it is associated with. The associations between scenario contexts and events are effectively many-to-many relations. The temporal panel shows a passage with several temporal scenario contexts, and the spatial panel shows a passage with several location scenario contexts.
  • Figure 2: Number of sentences between the relevant entity/event and its corresponding context.
  • Figure 3: Input prompt format used by the scenario context encoder-decoder model.
  • Figure 4: Output sequence format decoded by the scenario context encoder-decoder model.
  • Figure 5: Prompt used to elicit scenario context using an LLM.
  • ...and 2 more figures