EntGPT: Entity Linking with Generative Large Language Models
Yifan Ding, Amrit Poudel, Qingkai Zeng, Tim Weninger, Balaji Veeramani, Sanmitra Bhattacharya
TL;DR
EntGPT addresses entity linking (EL), the task of grounding mentions in text to knowledge base entries, using two large-language-model–driven strategies: a three-step prompt-based pipeline (EntGPT-P) and instruction tuning (EntGPT-I). EntGPT-P improves micro-$F_1$ by up to 36% over vanilla prompts and achieves competitive performance across ten entity disambiguation (ED) datasets, with Wikipedia as the knowledge base (KB), without supervised fine-tuning. EntGPT-I improves micro-$F_1$ by 2.1% on average in supervised EL tasks and outperforms several baseline models on six question answering (QA) tasks. The authors release data and code for reproducibility. This work highlights the value of explicit knowledge grounding and task-specific instruction tuning for ED and related QA tasks, with practical relevance for scalable knowledge-grounded NLP systems.
Abstract
Entity linking (EL) in natural language processing seeks to match entity mentions in text to their corresponding entries in a dictionary or knowledge base. Traditional approaches rely on contextual models, which can be complex, hard to train, and of limited transferability across domains. Generative large language models (LLMs) such as GPT offer a promising alternative but often underperform with naive prompts. In this study, we introduce EntGPT, which applies advanced prompt engineering to EL tasks. Our three-step hard-prompting method (EntGPT-P) boosts the micro-$F_1$ score by up to 36% over vanilla prompts and achieves competitive performance across 10 datasets without supervised fine-tuning. Additionally, our instruction-tuning method (EntGPT-I) improves micro-$F_1$ scores by 2.1% on average in supervised EL tasks and outperforms several baseline models on six question answering (QA) tasks. Our methods are compatible with both open-source and proprietary LLMs. All data and code are available on GitHub at https://github.com/yifding/In_Context_EL.
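For readers unfamiliar with the metric quoted above, micro-$F_1$ pools counts over all mentions in a dataset before computing precision and recall, rather than averaging per-document scores. The following is a minimal sketch under a common EL convention, assumed here and not taken from the EntGPT code: a prediction of `None` means the system abstained, precision is computed over non-abstained predictions, and recall over all gold mentions. The function name `micro_f1` and the toy data are hypothetical.

```python
from typing import Dict, Hashable, Optional


def micro_f1(gold: Dict[Hashable, str],
             pred: Dict[Hashable, Optional[str]]) -> float:
    """Micro-averaged F1 over mentions pooled from all documents.

    gold maps a mention id to its gold KB entity.
    pred maps a mention id to the predicted entity, or None if the
    system abstained (assumed convention, see lead-in).
    """
    # Number of mentions for which the system committed to an entity.
    n_pred = sum(1 for entity in pred.values() if entity is not None)
    # Number of committed predictions that match the gold entity.
    n_correct = sum(
        1 for mention, entity in pred.items()
        if entity is not None and gold.get(mention) == entity
    )
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: four gold mentions, three predictions, two of them correct.
gold = {"m1": "Paris", "m2": "Paris_Hilton", "m3": "Texas", "m4": "Nile"}
pred = {"m1": "Paris", "m2": "Paris", "m3": None, "m4": "Nile"}
score = micro_f1(gold, pred)  # precision 2/3, recall 2/4, F1 = 4/7
```

When a system predicts an entity for every mention, precision and recall coincide and micro-$F_1$ reduces to accuracy over mentions.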
