ChatEL: Entity Linking with Chatbots

Yifan Ding, Qingkai Zeng, Tim Weninger

TL;DR

This work tackles entity disambiguation with ChatEL, a three-step prompting framework for Large Language Models that requires no supervised fine-tuning. ChatEL first builds a compact candidate set from two sources (Prior and BLINK), then augments each mention with auxiliary content derived from the document, and finally uses a multi-choice prompt to select the best candidate based on descriptions drawn from the candidates' Wikipedia pages. Across ten datasets, GPT-4-based ChatEL improves average micro-F1 by 2.2 points over the previous state of the art, with notable gains on out-of-domain benchmarks, and it does so without annotated training data. An error analysis finds inconsistencies in the ground-truth labels, which suggests that the reported scores may understate true performance. All data and code are released for reproducibility.

Abstract

Entity Linking (EL) is an essential and challenging task in natural language processing that seeks to link text representing an entity within a document or sentence to its corresponding entry in a dictionary or knowledge base. Most existing approaches focus on building elaborate contextual models that look for clues in the words surrounding the entity text to help solve the linking problem. Although these fine-tuned language models tend to work, they can be unwieldy, are difficult to train, and do not transfer well to other domains. Fortunately, Large Language Models (LLMs) like GPT provide a highly advanced solution to the problems inherent in EL models, but naive prompts alone do not work well. In the present work, we define ChatEL, a three-step framework for prompting LLMs to return accurate results. Overall, the ChatEL framework improves the average F1 performance across 10 datasets by more than 2%. Finally, a thorough error analysis shows many instances where the ground truth labels were actually incorrect and the labels predicted by ChatEL were actually correct. This indicates that the quantitative results presented in this paper may be a conservative estimate of the actual performance. All data and code are available as an open-source package on GitHub at https://github.com/yifding/In_Context_EL.
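
To make the "average F1 across 10 datasets" figure concrete, the sketch below shows one common way such a number is computed: a micro-F1 per dataset over all mentions, then an unweighted mean over datasets. The exact evaluation protocol is not spelled out in this summary, so the function names (`micro_f1`, `mean_f1`) and the choice to count an abstained prediction (`None`) against recall only are assumptions for illustration.

```python
from typing import Dict, List, Optional


def micro_f1(gold: List[str], pred: List[Optional[str]]) -> float:
    """Micro-F1 over all mentions of one dataset.

    Assumption: a prediction of None means the system abstained, so it
    lowers recall but is not counted as a predicted link for precision.
    """
    correct = sum(1 for g, p in zip(gold, pred) if p is not None and p == g)
    predicted = sum(1 for p in pred if p is not None)
    precision = correct / predicted if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_f1(per_dataset: Dict[str, float]) -> float:
    """Unweighted mean of per-dataset scores (assumed averaging scheme)."""
    return sum(per_dataset.values()) / len(per_dataset)
```

When every mention receives a prediction, precision and recall coincide and micro-F1 reduces to plain accuracy over mentions.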

Paper Structure (23 sections, 3 figures, 6 tables)

Figures (3)

  • Figure 1: (a) General pipeline for supervised information extraction systems. These systems require careful modeling of the mention text and its context and are fine-tuned on a large language model (LLM). (b) ChatEL relinquishes the context modeling entirely to the LLM and instead directly prompts the LLM with the mention, context, and entity candidates. ChatEL obtains a mean F1 score of 0.795 over ten datasets, compared to 0.773 for the previous SOTA model.
  • Figure 2: Pipeline of the ChatEL framework. Given an input document with an annotated mention, ChatEL first (1) runs an entity candidate generation step to obtain relevant entities. It then (2) performs an augmentation step to obtain auxiliary content for the annotated mention. Finally, (3) it issues a multi-choice selection prompt to decide which entity the annotated mention corresponds to.
  • Figure 3: Error case of ChatEL predicting Ministry of Defense (Iran) vs Ministry of Defense (Japan).
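
The three steps in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the `prior` lookup table, the `retriever` function, the prompt wording, and the answer parsing are all hypothetical stand-ins. The released code at the GitHub link above is the authoritative reference.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical LLM interface: takes a prompt string, returns completion text.
LLM = Callable[[str], str]


def generate_candidates(mention: str,
                        prior: Dict[str, List[str]],
                        retriever: Callable[[str], List[str]],
                        k: int = 10) -> List[str]:
    """Step 1: merge two candidate sources, de-duplicated, order preserved."""
    merged: List[str] = []
    for entity in prior.get(mention.lower(), []) + retriever(mention):
        if entity not in merged:
            merged.append(entity)
    return merged[:k]


def augment_mention(llm: LLM, mention: str, document: str) -> str:
    """Step 2: ask the LLM for auxiliary content describing the mention."""
    prompt = f'{document}\n\nWhat does "{mention}" refer to in this document?'
    return llm(prompt).strip()


def select_entity(llm: LLM, mention: str, auxiliary: str,
                  candidates: List[str],
                  descriptions: Dict[str, str]) -> Optional[str]:
    """Step 3: multi-choice prompt; parse the first integer in the reply."""
    options = "\n".join(
        f"({i}) {name}: {descriptions.get(name, '')}"
        for i, name in enumerate(candidates, start=1)
    )
    prompt = (
        f'Mention: "{mention}"\nAbout the mention: {auxiliary}\n\n'
        f"Which entity does the mention refer to?\n{options}\n"
        "Answer with the option number only."
    )
    match = re.search(r"\d+", llm(prompt))
    if match is None:
        return None
    index = int(match.group())
    # Reject numbers outside the option range instead of guessing.
    return candidates[index - 1] if 1 <= index <= len(candidates) else None


def chat_el(llm: LLM, mention: str, document: str,
            prior: Dict[str, List[str]],
            retriever: Callable[[str], List[str]],
            descriptions: Dict[str, str]) -> Optional[str]:
    """Chain the three steps for a single annotated mention."""
    candidates = generate_candidates(mention, prior, retriever)
    auxiliary = augment_mention(llm, mention, document)
    return select_entity(llm, mention, auxiliary, candidates, descriptions)
```

Returning `None` when the reply cannot be parsed keeps a malformed LLM answer from silently becoming a wrong link. A real system might retry or fall back to the top-ranked candidate instead.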