All Entities are Not Created Equal: Examining the Long Tail for Ultra-Fine Entity Typing

Advait Deshmukh, Ashwin Umadi, Dananjay Srinivas, Maria Leonor Pacheco

TL;DR

This paper examines ultra-fine entity typing (UFET) in the long-tail regime, where rare entities are underrepresented in pre-training data. It proposes a simple, practical proxy for pre-training frequency based on Google search hit counts and shows that PLM-derived probabilities correlate strongly with this proxy across multiple models, which supports reading the proxy as a reflection of pre-training exposure. A comparative benchmark of seven models (both PLM-only and knowledge-infused) on UFET and OntoNotes finds that PLMs struggle on infrequent entities, while knowledge-infused approaches such as LITE, which leverages label dependencies via an NLI framework, are more robust to frequency shifts. The findings argue for integrating external resources and structured knowledge into UFET systems to improve long-tail performance, and point future work toward knowledge-aware typing solutions for real-world use.

Abstract

Due to their capacity to acquire world knowledge from large corpora, pre-trained language models (PLMs) are extensively used in ultra-fine entity typing tasks where the space of labels is extremely large. In this work, we explore the limitations of the knowledge acquired by PLMs by proposing a novel heuristic to approximate the pre-training distribution of entities when the pre-training data is unknown. Then, we systematically demonstrate that entity-typing approaches that rely solely on the parametric knowledge of PLMs struggle significantly with entities at the long tail of the pre-training distribution, and that knowledge-infused approaches can account for some of these shortcomings. Our findings suggest that we need to go beyond PLMs to produce solutions that perform well for infrequent entities.
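
The validation step summarized in the TL;DR (checking whether search hit counts track PLM-derived probabilities, then comparing performance across frequency bins) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the summary does not say which correlation statistic or binning scheme the authors use, so Spearman rank correlation and equal-sized quantile bins are stand-ins, and the function names `average_ranks`, `spearman`, and `frequency_bins` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors.

    Assumed statistic; the source only says the two quantities
    "strongly correlate". Undefined if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)


def frequency_bins(hit_counts, n_bins):
    """Assign each entity to an equal-sized bin by ascending hit count.

    Bin 0 holds the rarest entities. Quantile binning is an assumption;
    the paper's actual bin boundaries are not given in this summary.
    """
    order = sorted(range(len(hit_counts)), key=lambda i: hit_counts[i])
    bins = [0] * len(hit_counts)
    for position, index in enumerate(order):
        bins[index] = position * n_bins // len(hit_counts)
    return bins
```

With per-entity hit counts and per-entity model probabilities in two parallel lists, `spearman(hit_counts, probabilities)` gives a value in [-1, 1], and `frequency_bins(hit_counts, n_bins)` yields the bin index that per-bin evaluation would group on. Rank correlation is used in this sketch because hit counts span many orders of magnitude, so a rank-based measure avoids having to choose a transformation first.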

Paper Structure

This paper contains 23 sections, 1 equation, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Entity distribution across UFET test bins
  • Figure 2: Baseline vs. Knowledge-enhanced Performance across test bins
  • Figure 3: Effect of scaling on performance across UFET bins
  • Figure 4: Average UFET entity recovery probability versus average number of tokens per word for three model tokenizers
  • Figure 5: Average number of tokens for UFET test bins
  • ...and 3 more figures