ImCoref-CeS: An Improved Lightweight Pipeline for Coreference Resolution with LLM-based Checker-Splitter Refinement

Kangyang Luo, Yuzhuo Bai, Shuzheng Si, Cheng Gao, Zhitong Wang, Yingli Shen, Wenhao Li, Zhu Liu, Yufeng Han, Jiayi Wu, Cunliang Kong, Maosong Sun

TL;DR

ImCoref-CeS tackles coreference resolution by marrying a lightweight, high-performance supervised CR model with the reasoning capabilities of large language models. The framework enhances the supervised backbone (ImCoref) with a Long-Text Encoding Bridging Module, a biaffine end-to-end scoring mechanism, and Hybrid Mention Regularization to improve efficiency and long-range mention handling. An LLM-based Checker-Splitter is dynamically integrated during inference to validate mentions and refine clusters, using structured prompts and filtering to manage cost. Across OntoNotes, LitBank, and WikiCoref, ImCoref-CeS demonstrates superior coreference performance and better generalization, while offering practical trade-offs between accuracy and latency for real-world deployment.

Abstract

Coreference Resolution (CR) is a critical task in Natural Language Processing (NLP). Current research faces a key dilemma: whether to further explore the potential of supervised neural methods based on small language models, whose detect-then-cluster pipeline still delivers top performance, or to embrace the powerful capabilities of Large Language Models (LLMs). Effectively combining the strengths of both, however, remains underexplored. To this end, we propose ImCoref-CeS, a novel framework that integrates an enhanced supervised model with LLM-based reasoning. First, we present an improved CR method (ImCoref) that pushes the performance boundaries of the supervised neural method by introducing a lightweight bridging module to enhance long-text encoding capability, devising a biaffine scorer to comprehensively capture positional information, and invoking a hybrid mention regularization to improve training efficiency. Second, we employ an LLM acting as a multi-role Checker-Splitter agent to validate the candidate mentions (filtering out invalid ones) and the coreference results (splitting erroneous clusters) predicted by ImCoref. Extensive experiments demonstrate the effectiveness of ImCoref-CeS, which achieves superior performance compared to existing state-of-the-art (SOTA) methods.

Paper Structure

This paper contains 21 sections, 6 equations, 6 figures, 20 tables.

Figures (6)

  • Figure 1: Combining the strengths of supervised neural methods and LLMs for CR, where LE. and DR. are low-resource efficiency and deep reasoning, respectively.
  • Figure 2: The overall pipeline of ImCoref-CeS. The dashed arrow indicates the process path where only ImCoref is executed; its generated mentions and coreference results are enclosed within the dashed box. Relying solely on ImCoref during inference may produce invalid mentions (e.g., (man,1700,1700)). Furthermore, ImCoref carries these invalid mentions forward, propagating them into erroneous coreference results. To mitigate these issues, we introduce an LLM as a multi-role Checker-Splitter agent, dynamically integrating it with ImCoref.
  • Figure 3: Avg.F1 (%) and Training Time (h) with varying concatenation strategies and $L_{\text{max}}$.
  • Figure 4: Avg.F1 (%) and Inference Time (m) with varying $\eta$.
  • Figure 5: Illustration of text segmentation strategies and LBM: independent splits $D$ into non-overlapping segments of length $T$; overlapping generates segments via a sliding window with a $T/2$ step size; LBM introduces inter-segment semantic propagation atop independent.
  • ...and 1 more figure
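
The two baseline segmentation strategies named in the Figure 5 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the token-list input, and the handling of the final short segment are assumptions, and $T/2$ is taken as integer division. LBM is not sketched because the caption does not specify how inter-segment propagation is computed.

```python
from typing import List, Sequence


def split_independent(tokens: Sequence[str], T: int) -> List[List[str]]:
    """Split a document into non-overlapping segments of length T.

    The last segment may be shorter than T (an assumption; the caption
    does not say how a trailing remainder is handled).
    """
    if T <= 0:
        raise ValueError("T must be positive")
    return [list(tokens[i:i + T]) for i in range(0, len(tokens), T)]


def split_overlapping(tokens: Sequence[str], T: int) -> List[List[str]]:
    """Split a document with a sliding window of length T and step T // 2."""
    if T < 2:
        raise ValueError("T must be at least 2 so that the step is positive")
    if not tokens:
        return []
    step = T // 2
    segments = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + T]))
        # Stop once the current window reaches the end of the document.
        if start + T >= len(tokens):
            break
        start += step
    return segments


if __name__ == "__main__":
    doc = [f"w{i}" for i in range(10)]
    print(split_independent(doc, 4))
    print(split_overlapping(doc, 4))
```

With a 10-token document and T = 4, the independent strategy yields three segments (the last holding two tokens), while the overlapping strategy yields four full windows, each sharing half its tokens with the next. That redundancy is the extra encoding cost the overlapping strategy pays for cross-segment context.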