Wiki-TabNER: Integrating Named Entity Recognition into Wikipedia Tables
Aneta Koleva, Martin Ringsquandl, Ahmed Hatem, Thomas Runkler, Volker Tresp
TL;DR
This work introduces Wiki-TabNER, a benchmark dataset that brings real-world Wikipedia tables into NER evaluation by preserving multi-entity cells and annotating entities with DBpedia semantic classes and Wikidata IDs. It details dataset construction from the WikiTables corpus, including table filtering, entity extraction, semantic annotation, and span-based labeling embedded in the table structure. The paper also proposes a prompting-based evaluation framework for large language models to perform within-table NER, accompanied by qualitative analysis and ablation studies that reveal challenges in type prediction and label granularity. Findings show that while in-context learning improves NER within tables, the task remains difficult for current LLMs, underscoring the need for robust table NER benchmarks. The dataset is released to foster further research, with future work including multi-label classification and entity linking (EL) integration.
Abstract
Interest in solving table interpretation tasks has grown over the years, yet evaluation still relies on existing datasets that may be overly simplified. This simplification potentially reduces their effectiveness for thorough evaluation and fails to accurately represent tables as they appear in the real world. To enrich the existing benchmark datasets, we extract and annotate a new, more challenging dataset. The proposed Wiki-TabNER dataset features complex tables containing several entities per cell, with named entities labeled using DBpedia classes. The dataset is specifically designed to address the named entity recognition (NER) task within tables, but it can also serve as a more challenging dataset for evaluating the entity linking task. In this paper, we describe the distinguishing features of the Wiki-TabNER dataset and the labeling process. In addition, we propose a prompting framework for evaluating recent large language models on the within-table NER task. Finally, we perform a qualitative analysis to gain insights into the challenges encountered by the models and to understand the limitations of the proposed dataset.
