A Reasoning Paradigm for Named Entity Recognition
Hui Huang, Yanping Chen, Ruizhang Huang, Chuan Lin, Yongbin Qin
TL;DR
ReasoningNER reframes Named Entity Recognition as an explicit reasoning task, replacing opaque pattern-matching with a structured chain-of-thought approach. It introduces a three-stage pipeline (CoT Generation, CoT Tuning, Reasoning Enhancement) backed by an NER-CoT corpus and a GRPO-based reinforcement objective to optimize reasoning quality and extraction accuracy. Across zero-shot, few-shot, and cross-domain benchmarks, ReasoningNER achieves state-of-the-art performance, demonstrates strong cross-lingual transfer, and shows data-efficient learning even with modest CoT supervision. The work highlights the value of reasoning-oriented information extraction and lays groundwork for extending explicit reasoning to broader universal information extraction, while noting latency as a current trade-off and suggesting CoT compression and broader task coverage as future directions.
Abstract
Generative LLMs typically improve Named Entity Recognition (NER) performance through instruction tuning. They excel at generating entities by semantic pattern matching but lack an explicit, verifiable reasoning mechanism. This "cognitive shortcutting" leads to suboptimal performance and brittle generalization, especially in zero-shot and low-resource scenarios where reasoning from limited contextual cues is crucial. To address this issue, a reasoning framework is proposed for NER, which shifts the extraction paradigm from implicit pattern matching to explicit reasoning. This framework consists of three stages: Chain of Thought (CoT) generation, CoT tuning, and reasoning enhancement. First, a dataset annotated with NER-oriented CoTs is generated, which contains task-relevant reasoning chains. These chains are then used to tune the NER model to generate coherent rationales before deriving the final answer. Finally, a reasoning enhancement stage is implemented to optimize the reasoning process using a comprehensive reward signal. This stage ensures explicit and verifiable extractions. Experiments show that ReasoningNER demonstrates impressive cognitive ability in the NER task, achieving competitive performance. In zero-shot settings, it achieves state-of-the-art (SOTA) performance, outperforming GPT-4 by 12.3 percentage points on the F1 score. Analytical results also demonstrate its great potential to advance research in reasoning-oriented information extraction. Our code is available at https://github.com/HuiResearch/ReasoningIE.
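To make the reasoning enhancement stage more concrete, the following is a minimal sketch of how a reward could be turned into a group-relative training signal in the style of GRPO. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the components of the "comprehensive reward signal", so entity-level F1 is used here as a stand-in, and the function names `entity_f1` and `group_relative_advantages` as well as the toy sentence are hypothetical.

```python
from statistics import mean, pstdev


def entity_f1(predicted, gold):
    """Entity-level F1 over sets of (span, type) pairs.

    Assumption: used here as a stand-in reward; the paper's actual
    reward components are not given in the abstract.
    """
    pred, ref = set(predicted), set(gold)
    if not pred and not ref:
        return 1.0  # nothing to extract and nothing extracted
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and standard deviation
    of its own sampling group (GRPO-style advantage)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical example: four sampled outputs for one input sentence.
gold = [("Paris", "LOC"), ("Marie Curie", "PER")]
samples = [
    [("Paris", "LOC"), ("Marie Curie", "PER")],  # fully correct
    [("Paris", "LOC")],                          # misses one entity
    [("Paris", "ORG")],                          # wrong type
    [],                                          # extracts nothing
]

rewards = [entity_f1(sample, gold) for sample in samples]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.3f}  advantage={advantage:+.3f}")
```

In this sketch, outputs that score above their group's average receive a positive advantage and those below receive a negative one, so no separate value model is needed to judge a sample. A fuller reward would presumably also score the format and quality of the generated rationale, which is left out here.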
