Neon: News Entity-Interaction Extraction for Enhanced Question Answering

Sneha Singhania, Silviu Cucerzan, Allen Herring, Sujay Kumar Jauhar

TL;DR

NEON is a framework that extracts emerging entity interactions, such as events or activities, from news articles and integrates the resulting open Information Extraction (openIE) style tuples into LLMs to enable in-context retrieval-augmented generation.

Abstract

Capturing fresh information in near real-time and using it to augment existing large language models (LLMs) is essential to generate up-to-date, grounded, and reliable output. This problem becomes particularly challenging when LLMs are used for informational tasks in rapidly evolving fields, such as Web search related to recent or unfolding events involving entities, where generating temporally relevant responses requires access to up-to-the-hour news sources. However, the information modeled by the parametric memory of LLMs is often outdated, and Web results from prototypical retrieval systems may fail to capture the latest relevant information and struggle to handle conflicting reports in evolving news. To address this challenge, we present the NEON framework, designed to extract emerging entity interactions -- such as events or activities -- as described in news articles. NEON constructs an entity-centric timestamped knowledge graph that captures such interactions, thereby facilitating enhanced QA capabilities related to news events. Our framework innovates by integrating open Information Extraction (openIE) style tuples into LLMs to enable in-context retrieval-augmented generation. This integration demonstrates substantial improvements in QA performance when tackling temporal, entity-centric search queries. Through NEON, LLMs can deliver more accurate, reliable, and up-to-date responses.
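To make the idea of timestamped openIE-style tuples feeding in-context retrieval-augmented generation more concrete, here is a minimal sketch. It is not the authors' implementation: the names `InteractionTuple`, `retrieve`, and `build_prompt`, the seven-day recency window, the exact-match entity lookup, and the placeholder entities are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InteractionTuple:
    """Hypothetical openIE-style (subject, relation, object) tuple with a timestamp."""
    subject: str
    relation: str
    obj: str
    published: date


def retrieve(tuples, entity, as_of, window_days=7, top_k=3):
    """Return up to top_k tuples mentioning `entity`, newest first.

    Assumption: "temporal retrieval" is modeled as a simple recency window
    ending at `as_of`; the paper may use a different strategy.
    """
    name = entity.lower()
    hits = [
        t for t in tuples
        if name in (t.subject.lower(), t.obj.lower())
        and 0 <= (as_of - t.published).days <= window_days
    ]
    hits.sort(key=lambda t: t.published, reverse=True)
    return hits[:top_k]


def build_prompt(question, facts):
    """Place the retrieved tuples in the prompt as dated context lines."""
    lines = [
        f"- [{t.published.isoformat()}] {t.subject} | {t.relation} | {t.obj}"
        for t in facts
    ]
    context = "\n".join(lines) if lines else "(no recent facts found)"
    return f"Recent facts:\n{context}\n\nQuestion: {question}\nAnswer:"


# Placeholder data; the entities and events are invented for the example.
graph = [
    InteractionTuple("Entity A", "announced", "a new album", date(2023, 8, 30)),
    InteractionTuple("Entity B", "met with", "Entity A", date(2023, 8, 28)),
    InteractionTuple("Entity A", "cancelled", "a tour date", date(2023, 6, 1)),
    InteractionTuple("Entity C", "acquired", "Entity D", date(2023, 8, 29)),
]

facts = retrieve(graph, "Entity A", as_of=date(2023, 8, 31))
prompt = build_prompt("Why is Entity A in the news?", facts)
print(prompt)
```

The prompt string would then be passed to an LLM of choice. Keeping the timestamp on every tuple is what lets the retrieval step prefer recent interactions over stale ones when reports conflict.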

Paper Structure

This paper contains 13 sections, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Example of entity-centric, time-specific QA. The graph shows search interest for Doja Cat over a four-month period. The bottom part illustrates response generation at one of the peaks (31 August) using three different techniques: (i) zero-shot prompting, (ii) news-snippet-based prompting, (iii) augmenting the prompt with tuples from our Neon graph for enhanced answer generation.
  • Figure 2: Implemented temporal QA pipeline
  • Figure 3: Coverage for 50 entities in 500 diverse news sources over a period of one year (2023)
  • Figure 4: Performance comparison with varying top-$k$ parameter, temporal retrieval and few-shot prompts.
  • Figure A1: Prompt for Neon($\mathcal{M}_1$) construction.
  • ...and 1 more figure