Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation?

Wataru Hashimoto; Hidetaka Kamigaito; Taro Watanabe

TL;DR

In NER, data augmentation improves calibration and uncertainty estimation in cross-genre and cross-lingual settings, especially in the in-domain setting. Calibration tends to improve more when the sentences generated by data augmentation have lower perplexity, and increasing the amount of augmented data further improves calibration and uncertainty estimation.

Abstract

This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER) tasks. For NER to advance in safety-critical fields such as healthcare and finance, Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), must produce accurate predictions with calibrated confidence when deployed in real-world applications. However, DNNs are prone to miscalibration, which limits their applicability. Moreover, existing methods for calibration and uncertainty estimation are computationally expensive. Our investigation in NER found that data augmentation improves calibration and uncertainty estimation in cross-genre and cross-lingual settings, especially in the in-domain setting. Furthermore, we showed that calibration for NER tends to be more effective when the perplexity of the sentences generated by data augmentation is lower, and that increasing the amount of augmented data further improves calibration and uncertainty estimation.
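
As a rough illustration of what "calibrated confidence" means in this context, the sketch below computes the Expected Calibration Error (ECE), a widely used measure of miscalibration. The abstract does not name the metric used in the paper, so the choice of ECE, the function name `expected_calibration_error`, the equal-width binning, and the toy inputs are all assumptions made for illustration only.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Hypothetical helper: ECE over equal-width confidence bins.

    confidences: predicted probability of the chosen label, in [0, 1].
    correct:     whether each prediction matched the gold label.
    Bins are [k/n_bins, (k+1)/n_bins); a confidence of exactly 1.0
    is placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for s_conf, s_hit, cnt in zip(conf_sum, hit_sum, count):
        if cnt:
            # (cnt / n) * |accuracy - mean confidence| simplifies to this.
            ece += abs(s_hit - s_conf) / n
    return ece


# Toy example: four predictions at 0.9 confidence, three of them correct.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

In the toy example the model claims 90% confidence but is right 75% of the time, so the gap between confidence and accuracy is what the score captures; a perfectly calibrated model would score 0.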

Paper Structure (47 sections, 5 equations, 4 figures, 15 tables)

Introduction
Related Work
Named Entity Recognition
Uncertainty Estimation
Data Augmentation
Methods
Existing Calibration Methods
Baseline
Temperature Scaling (TS)
Label Smoothing (LS)
Monte-Carlo Dropout (MC Dropout)
Data Augmentation Methods for NER
Label-wise Token Replacement (LwTR)
Mention Replacement (MR)
Synonym Replacement (SR)
...and 32 more sections

Figures (4)

Figure 1: t-SNE plot of token embeddings of the OntoNotes 5.0 $\mathtt{bn}$ training set (red), data generated by MELM (blue), the source domain test set (green), and the OntoNotes 5.0 $\mathtt{wb}$ test set (purple).
Figure 2: Average values of evaluation metrics for each data augmentation method in ID settings.
Figure 3: Average values of evaluation metrics for each data augmentation method in OOD settings.
Figure 4: t-SNE plot of token embeddings of the MultiCoNER $\mathtt{EN}$ training set (red), data generated by SR (blue), the source domain test set (green), and the MultiCoNER $\mathtt{HI}$ test set (purple).
