Unified Examination of Entity Linking in Absence of Candidate Sets

Nicolas Ong, Hassan Shavarani, Anoop Sarkar

TL;DR

The paper evaluates entity linking (EL) systems when predefined candidate sets are unavailable, a setting of practical importance. It introduces a unified black-box benchmarking workflow built on GERBIL, together with a gerbil_connect middleware that standardizes inputs and outputs across diverse EL methods, and evaluates on CoNLL/AIDA D11-1072 and a newly annotated AIDA/testc set. A core contribution is the candidate set ablation study, in which hand-crafted candidate sets are either removed or replaced by the full in-domain vocabulary (5598 entities). The study shows that most methods depend strongly on candidate sets and suffer substantial performance drops without them. Generative models tend to be more resilient to the absence of candidate sets than discriminative, candidate-based approaches, while expanding the candidate pool increases inference time by orders of magnitude in several systems. Overall, the work highlights the need for robust, relevance-agnostic EL approaches and provides a publicly accessible benchmarking recipe for reproducible, cross-method evaluation.

Abstract

Despite remarkable strides made in the development of entity linking systems in recent years, a comprehensive comparative analysis of these systems using a unified framework is notably absent. This paper addresses this oversight by introducing a new black-box benchmark and conducting a comprehensive evaluation of all state-of-the-art entity linking methods. We use an ablation study to investigate the impact of candidate sets on the performance of entity linking. Our findings uncover exactly how much such entity linking systems depend on candidate sets, and how much this limits the general applicability of each system. We present an alternative approach to candidate sets, demonstrating that leveraging the entire in-domain candidate set can serve as a viable substitute for certain models. We show the trade-off between less restrictive candidate sets, increased inference time and memory footprint for some models.
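
To make the ablation idea concrete, the following is a minimal sketch of linking a mention with and without a hand-crafted candidate set. It is not taken from the paper or from any of the evaluated systems: the function name `link_mention`, the toy scores, and the tiny vocabulary are all hypothetical and serve only to illustrate how replacing a candidate set with the full in-domain vocabulary can change both the prediction and the number of entities that must be scored.

```python
from typing import Callable, Dict, List, Optional

def link_mention(
    mention: str,
    score: Callable[[str, str], float],
    candidate_sets: Optional[Dict[str, List[str]]],
    vocabulary: List[str],
) -> str:
    """Return the highest-scoring entity for a mention (hypothetical helper).

    If candidate_sets is None (the ablated setting), every entity in the
    in-domain vocabulary is scored instead of a short hand-crafted list.
    """
    if candidate_sets is None:
        pool = vocabulary
    else:
        pool = candidate_sets.get(mention, vocabulary)
    return max(pool, key=lambda entity: score(mention, entity))

# Toy data, invented for illustration only.
vocabulary = ["Paris", "Paris_Hilton", "France"]
candidate_sets = {"Paris": ["Paris", "France"]}
toy_scores = {
    ("Paris", "Paris"): 0.6,
    ("Paris", "Paris_Hilton"): 0.7,  # a distractor the candidate set filters out
    ("Paris", "France"): 0.1,
}

def score(mention: str, entity: str) -> float:
    return toy_scores.get((mention, entity), 0.0)

with_candidates = link_mention("Paris", score, candidate_sets, vocabulary)
without_candidates = link_mention("Paris", score, None, vocabulary)
print(with_candidates, without_candidates)
```

In this toy setting the candidate set hides a distractor that the scorer would otherwise prefer, and the ablated call scores every vocabulary entry, which is the kind of cost that grows with the size of the candidate pool.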

Paper Structure (27 sections, 1 equation, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Entity linking error distribution in four categories of over-generated (gray, vertical), under-generated (red, horizontal), incorrect entity (teal, north east) and incorrect mention (blue, north west) before candidate set ablations (left) and after the ablations (right). The y-axis is the error analysis ratio as described below.
  • Figure 2: Differences in entity linking micro precision (blue, north east) and micro recall (red, north west) on testa between each model's original configuration and its candidate set ablation configuration.
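
For readers unfamiliar with the micro-averaged scores mentioned in Figure 2, the sketch below shows one common way to compute them: true positives, predictions, and gold annotations are pooled over all documents before dividing. The annotation format `(start, end, entity)` and the function name are assumptions made for illustration; the exact matching criteria used by GERBIL in the paper may differ.

```python
from typing import Dict, Set, Tuple

Annotation = Tuple[int, int, str]  # (start offset, end offset, entity); assumed format

def micro_precision_recall(
    gold: Dict[str, Set[Annotation]],
    predicted: Dict[str, Set[Annotation]],
) -> Tuple[float, float]:
    """Pool counts over all documents, then compute precision and recall."""
    true_positives = sum(
        len(gold_anns & predicted.get(doc_id, set()))
        for doc_id, gold_anns in gold.items()
    )
    n_predicted = sum(len(anns) for anns in predicted.values())
    n_gold = sum(len(anns) for anns in gold.values())
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall

# Toy annotations, invented for illustration only.
gold = {
    "doc1": {(0, 5, "Paris"), (10, 16, "France")},
    "doc2": {(3, 9, "Berlin")},
}
predicted = {
    "doc1": {(0, 5, "Paris"), (10, 16, "Germany")},
    "doc2": set(),
}

precision, recall = micro_precision_recall(gold, predicted)
print(f"micro precision={precision:.3f}, micro recall={recall:.3f}")
```

Under this definition, a system that over-generates links loses precision, while one that under-generates loses recall, which is why the two scores can move in different directions after an ablation.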