HREB-CRF: Hierarchical Reduced-bias EMA for Chinese Named Entity Recognition

Sijin Sun, Ming Deng, Xinrui Yu, Liangbin Zhao

TL;DR

This work targets boundary detection and long-range dependency modeling in Chinese Named Entity Recognition (CNER) by introducing HREB-CRF, a Hierarchical Reduced-bias EMA framework with CRF. The approach uses a RoBERTa-based backbone, a Hierarchical EMA module that jointly models short-range and long-range cues, and a reduced-bias design that stabilizes deep training, followed by Bi-LSTM and CRF decoding. Key contributions include the integration of Reduced-biased Hierarchical EMA (RHEMA) into NER, a dynamic residual (Reduced-bias) mechanism to improve gradient flow, and empirical validation showing state-of-the-art results on MSRA, Resume, and Weibo, with F1 gains over the baseline of +1.1% on MSRA, +1.6% on Resume, and +9.8% on Weibo. The results indicate robust boundary detection and context modeling across datasets, and ablations confirm the importance of the EMA and reduced-bias modules and of the hyperparameter settings.

Abstract

Incorrect boundary division, complex semantic representation, and differences in pronunciation and meaning often lead to errors in Chinese Named Entity Recognition (CNER). To address these issues, this paper proposes the HREB-CRF framework: Hierarchical Reduced-bias EMA with CRF. The proposed method amplifies word boundaries and pools long text gradients through an exponentially fixed-bias weighted average of local and global hierarchical attention. Experimental results on the MSRA, Resume, and Weibo datasets show excellent F1 scores, outperforming the baseline model by 1.1%, 1.6%, and 9.8%, respectively. The significant improvement in F1 is evidence of the effectiveness and robustness of the approach in CNER tasks.
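To make the "exponentially weighted average" idea concrete, the following is a minimal sketch of a plain exponential moving average (EMA) over a one-dimensional sequence. It is not the paper's RHEMA module: the function name `ema_smooth`, the zero initial state, and the two `alpha` values are assumptions chosen for illustration, and this summary does not specify how the local and global views are combined.

```python
from typing import List


def ema_smooth(xs: List[float], alpha: float) -> List[float]:
    """Plain EMA: h_t = alpha * x_t + (1 - alpha) * h_{t-1}.

    Hypothetical helper for illustration only; not the authors' RHEMA.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out: List[float] = []
    h = 0.0  # assumed zero initial state
    for x in xs:
        h = alpha * x + (1.0 - alpha) * h
        out.append(h)
    return out


# A single spike at position 2 stands in for a boundary cue.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]

# Large alpha: short memory, the spike stays sharp (a "local" view).
local_view = ema_smooth(signal, alpha=0.9)

# Small alpha: long memory, the spike decays slowly (a "global" view).
global_view = ema_smooth(signal, alpha=0.2)
```

With a large `alpha` the spike is preserved almost unchanged at its own position and vanishes quickly afterwards, while a small `alpha` carries a weaker trace of it further along the sequence. This is the general trade-off between short-range and long-range cues that the abstract alludes to.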

Paper Structure

This paper contains 25 sections, 20 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Overall structure of the proposed model. The left panel shows the end-to-end model architecture from text input to entity classification output, and the right panel shows the data processing flow of the RHEMA architecture.
  • Figure 2: Proposed RHEMA architecture with added reduced-biased module. The figure illustrates the architecture of the RHEMA framework for text processing, enhanced by the addition of a reduced-bias module, marked in red. The core RHEMA block includes sequential processing through Batch Normalization and a FeedForward network. A novel "Reduced-Bias Add" module, shown with red arrows, is integrated to mitigate potential bias in the output, while preserving a feedback loop to earlier stages.
  • Figure 3: The architecture of single-head attention.
  • Figure 4: Based on the web search results, a piece of news was randomly selected as a test sample, which contains the extraction of names of people, places and organizations.
  • Figure 5: Cases in the MSRA validation set are selected for analysis.
  • ...and 1 more figure