Uncertainty Estimation on Sequential Labeling via Uncertainty Transmission
Jianfeng He, Linlin Yu, Shuo Lei, Chang-Tien Lu, Feng Chen
TL;DR
This work tackles uncertainty estimation for sequential labeling in NER (UE-NER) by introducing the Sequential Labeling Posterior Network (SLPN), which transmits uncertainty across tokens via a revised self-attention mechanism. Building on evidential deep learning, SLPN combines token-level Dirichlet posteriors ${\boldsymbol{\beta}^{\text{post}}}$ with an uncertainty transmission term ${\boldsymbol{\beta}^{\text{trans}}}$ to form ${\boldsymbol{\beta}^{\text{agg}} = \boldsymbol{\beta}^{\text{post}} + \boldsymbol{\beta}^{\text{trans}}}$; the resulting Dirichlet parameters ${\boldsymbol{\alpha}^{\text{agg}} = \boldsymbol{\beta}^{\text{agg}} + \boldsymbol{\beta}^{\text{prior}}}$ drive the token-level uncertainty estimates. An evaluation framework treats OOD detection and wrong-span (WS) entity detection separately, enabling a robust assessment of UE-NER performance. Experiments on MIT-Restaurant, Mov-Sim, and Mov-Com show that SLPN achieves strong improvements in weighted OOD/WS metrics and that the transmitted uncertainty component is essential to these gains, though WS detection remains challenging. The approach benefits safety-critical information extraction by improving OOD detection without sacrificing NER accuracy, and it highlights areas for future work such as WS-focused improvements and broader model generalization.
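The aggregation described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. Several pieces are assumptions made for illustration only: the function names (`transmit`, `aggregate_alpha`, `expected_probs`, `vacuity`) are hypothetical; the transmission step is approximated by an attention-weighted sum of the other tokens' $\boldsymbol{\beta}^{\text{post}}$, which stands in for the revised self-attention mechanism; and the uncertainty score is the vacuity $K / \sum_c \alpha_c$, a common evidential measure that assumes a prior of 1 per class. Only the two sums $\boldsymbol{\beta}^{\text{agg}} = \boldsymbol{\beta}^{\text{post}} + \boldsymbol{\beta}^{\text{trans}}$ and $\boldsymbol{\alpha}^{\text{agg}} = \boldsymbol{\beta}^{\text{agg}} + \boldsymbol{\beta}^{\text{prior}}$ are taken from the text.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row per token, one column per class


def transmit(beta_post: Matrix, attn: Matrix) -> Matrix:
    """Hypothetical transmission step (an assumption, not the paper's exact
    mechanism): token i receives an attention-weighted sum of the posterior
    values of every OTHER token j; the self weight attn[i][i] is ignored."""
    n_tokens = len(beta_post)
    n_classes = len(beta_post[0])
    beta_trans = [[0.0] * n_classes for _ in range(n_tokens)]
    for i in range(n_tokens):
        for j in range(n_tokens):
            if i == j:
                continue
            for c in range(n_classes):
                beta_trans[i][c] += attn[i][j] * beta_post[j][c]
    return beta_trans


def aggregate_alpha(beta_post: Matrix, beta_trans: Matrix, beta_prior: Vector) -> Matrix:
    """alpha_agg = (beta_post + beta_trans) + beta_prior, per token and class."""
    return [
        [p + t + pr for p, t, pr in zip(post_row, trans_row, beta_prior)]
        for post_row, trans_row in zip(beta_post, beta_trans)
    ]


def expected_probs(alpha: Matrix) -> Matrix:
    """Mean of each token's Dirichlet: alpha_c / sum(alpha)."""
    return [[a / sum(row) for a in row] for row in alpha]


def vacuity(alpha: Matrix) -> Vector:
    """Assumed uncertainty score: K / sum(alpha), valid for a prior of 1 per class."""
    return [len(row) / sum(row) for row in alpha]


# Toy input: 2 tokens, 3 classes. Token 0 has strong evidence, token 1 has little.
beta_post = [[8.0, 1.0, 1.0], [0.5, 0.5, 0.5]]
attn = [[0.0, 0.5], [0.5, 0.0]]  # made-up attention weights
beta_prior = [1.0, 1.0, 1.0]

beta_trans = transmit(beta_post, attn)
alpha_agg = aggregate_alpha(beta_post, beta_trans, beta_prior)
for tok, (p, u) in enumerate(zip(expected_probs(alpha_agg), vacuity(alpha_agg))):
    print(tok, [round(x, 3) for x in p], round(u, 3))
```

In this toy setup the low-evidence token receives evidence from its neighbor through the transmission term, yet it still has the higher vacuity of the two. This is the kind of token-level signal an OOD or WS detector could threshold.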
Abstract
Sequential labeling is the task of predicting a label for each token in a sequence; Named Entity Recognition (NER) is a typical example. NER aims to extract entities from a text and predict their labels, which is important in information extraction. Although previous works have made great progress in improving NER performance, uncertainty estimation on NER (UE-NER) remains underexplored despite being essential. This work focuses on UE-NER, which aims to estimate uncertainty scores for NER predictions. Previous uncertainty estimation models often overlook two unique characteristics of NER: the connection between entities (i.e., one entity embedding is learned based on the others) and wrong-span cases in the entity extraction subtask. Therefore, we propose a Sequential Labeling Posterior Network (SLPN) that estimates uncertainty scores for the extracted entities while accounting for uncertainty transmitted from other tokens. Moreover, we define an evaluation strategy that addresses the specificity of wrong-span cases. SLPN achieves significant improvements on three datasets, including a 5.54-point improvement in AUPR on the MIT-Restaurant dataset. Our code is available at \url{https://github.com/he159ok/UncSeqLabeling_SLPN}.
