DynamicER: Resolving Emerging Mentions to Dynamic Entities for RAG

Jinyoung Kim, Dayoon Ko, Gunhee Kim

TL;DR

This paper proposes a temporal segmented clustering method with continual adaptation that manages the temporal dynamics of evolving entities and emerging mentions, and that outperforms existing baselines on the QA task with resolved mentions.

Abstract

In the rapidly evolving landscape of language, resolving new linguistic expressions in continuously updating knowledge bases remains a formidable challenge. This challenge becomes critical in retrieval-augmented generation (RAG) with knowledge bases, as emerging expressions hinder the retrieval of relevant documents, leading to generator hallucinations. To address this issue, we introduce a novel task aimed at resolving emerging mentions to dynamic entities and present the DynamicER benchmark. Our benchmark includes a dynamic entity mention resolution task and an entity-centric knowledge-intensive QA task, which evaluate the adaptability of entity linking models and RAG models to new expressions, respectively. We find that current entity linking models struggle to link these new expressions to entities. Therefore, we propose a temporal segmented clustering method with continual adaptation, which effectively manages the temporal dynamics of evolving entities and emerging mentions. Extensive experiments demonstrate that our method outperforms existing baselines and enhances RAG model performance on the QA task with resolved mentions.

Paper Structure

This paper contains 25 sections, 4 equations, 3 figures, 19 tables.

Figures (3)

  • Figure 1: Motivation of our DynamicER benchmark. New mentions referring to the same entity are constantly created over time: as Shohei Ohtani transfers from the LA Angels to the LA Dodgers, he is referred to by new mentions such as ‘The Dodgers' number 17.’ We contribute a dynamic entity resolution dataset, along with two benchmark tests: traditional entity linking and entity-centric question-answering in the RAG context.
  • Figure 2: An illustrative example of TempCCA. $C$ denotes the representation of entity clusters formed in the previous time step. The rectangular boxes contain the entity input tokens, and the rounded boxes contain the input tokens for mention context. Entity names are highlighted, and mentions are underlined. TempCCA uses resolved mentions from the previous time step to form clusters, utilizing these cluster representations to resolve mentions in the subsequent time step. The attributes of entities that have changed are depicted in red text.
  • Figure 3: Mention variations in DynamicER
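
The Figure 2 caption describes the general loop: mentions resolved in one time step form clusters, and the cluster representations are then used to resolve mentions in the next time step. The sketch below illustrates that idea only; it is not the authors' TempCCA implementation. It assumes toy 2-D vectors in place of learned encodings, nearest-representation assignment by cosine similarity, and an exponential moving average as the adaptation step. The names `resolve_over_time`, `alpha`, and the data layout are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def resolve_over_time(entity_vecs, segments, alpha=0.5):
    """Resolve mentions segment by segment (hypothetical helper).

    entity_vecs: dict mapping entity name -> initial vector.
    segments: list of time segments in chronological order; each
        segment is a list of (mention_id, vector) pairs.
    alpha: assumed adaptation rate; 0 disables adaptation.
    Returns a dict mapping mention_id -> resolved entity name.
    """
    reps = {entity: list(vec) for entity, vec in entity_vecs.items()}
    assignments = {}
    for segment in segments:
        clusters = {entity: [] for entity in reps}
        # Resolve each mention against the current representations.
        for mention_id, vec in segment:
            best = max(reps, key=lambda e: cosine(reps[e], vec))
            assignments[mention_id] = best
            clusters[best].append(vec)
        # Adapt each representation toward the mentions it just absorbed,
        # so the next segment is resolved against the updated clusters.
        for entity, members in clusters.items():
            if members:
                centroid = mean(members)
                reps[entity] = [
                    (1 - alpha) * r + alpha * c
                    for r, c in zip(reps[entity], centroid)
                ]
    return assignments


entities = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
timeline = [
    [("m1", [0.8, 0.6])],  # earlier segment: mention still close to A
    [("m2", [0.6, 0.8])],  # later segment: mention has drifted further
]
adapted = resolve_over_time(entities, timeline, alpha=0.5)
static = resolve_over_time(entities, timeline, alpha=0.0)
```

In this toy timeline, the later mention `m2` is linked to `A` only when the representation of `A` has been updated with the earlier mention `m1`; with `alpha=0.0` the representations stay fixed and `m2` is linked to `B` instead.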