Uncertainty in Natural Language Processing: Sources, Quantification, and Applications

Mengting Hu; Zhen Zhang; Shiwan Zhao; Minlie Huang; Bingzhe Wu

TL;DR

This survey addresses uncertainty in NLP by first classifying its sources into input, system, and output, then detailing three main estimation approaches (calibration-based, sampling-based, distribution-based) and a set of evaluation metrics. It then surveys applications in data filtering, active learning, OOD detection, selective prediction, and efficiency/performance improvements, followed by a discussion of challenges posed by high-dimensional language spaces, variable-length generation, and ethical considerations. The authors argue for a holistic framework that combines theory, methods, and practical guidance to improve reliability and trustworthiness of NLP systems, especially for safety-critical applications. The work also highlights future directions for scalable uncertainty estimation in large pretrained language models and the need for clear uncertainty expression in natural language.

Abstract

As a main field of artificial intelligence, natural language processing (NLP) has achieved remarkable success via deep neural networks. Many NLP tasks are now addressed in a unified manner, with different tasks linked to one another by sharing the same paradigm. However, neural networks are black boxes that rely on probability computation, so mistakes are inevitable. Estimating the reliability and trustworthiness (in other words, the uncertainty) of neural networks has therefore become a key research direction, one that plays a crucial role in reducing models' risks and supporting better decisions. In this survey, we provide a comprehensive review of uncertainty-related work in NLP. Considering the characteristics of the data and paradigms, we first categorize the sources of uncertainty in natural language into three types: input, system, and output. Then, we systematically review uncertainty quantification approaches and their main applications. Finally, we discuss the challenges of uncertainty estimation in NLP and potential future directions, taking into account recent trends in the field. Although a few surveys on uncertainty estimation exist, our work is the first to review uncertainty from the NLP perspective.

Paper Structure

This paper contains 39 sections, 20 equations, 9 figures, and 3 tables.

Introduction
  Organization of The Survey
Uncertainty Sources
  Theory Background of Uncertainty
  Uncertainty Source from Input
    Language Intrinsic
    System Unknown Query
  Uncertainty Source from System
    Model Structure
    Model Training
  Uncertainty Source from Output
    Classification Paradigm
    Sequence Labeling Paradigm
    Generation Paradigm
    Regression Paradigm
...and 24 more sections

Figures (9)

Figure 1: An illustration of NLP systems applied in the medical domain.
Figure 2: Illustration of sources of uncertainty. The figure includes the sources of uncertainty in the interaction of the NLP system. We start from the three processes of Input, System, and Output to analyze the possible causes of each uncertainty. It is worth noting that these three parts are interrelated. As the query passes through the NLP system, due to the combination of neural networks and different task specifications in complex ways, the type of uncertainty in the output prediction becomes complicated, including both the aleatoric and epistemic uncertainty, which are hard to decompose. Thus, we refer to it as combined uncertainty.
Figure 3: Rationale visualization of three types of uncertainty modeling. For a given input sample $\mathbf{x}$, each method provides a prediction $y$ whose uncertainty is quantified as $u$. (a) Calibration-based uncertainty representation, (b) sampling-based uncertainty estimation method, (c) distribution-based uncertainty estimation method. $\Xi$ denotes a specific distribution. The mean $E$ and variance $Var$ are used only to keep the visualization simple; other quantifications are used in practice.
Figure 4: An overview of the taxonomy of uncertainty estimation techniques. The inner circle represents uncertainty modeling methods, the outer circle represents actual uncertainty methods, and some methods are represented by abbreviations.
Figure 5: Illustration of uncertainty representation method based on calibration confidence. a) Calibration curve, the closer to the perfect curve, the better the confidence calibration. b) Reliability diagram combined with ECE calibration indicators.
...and 4 more figures
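
The caption of Figure 5 refers to reliability diagrams and the ECE calibration indicator. As a minimal sketch, assuming the commonly used equal-width-bin definition of expected calibration error (the function name, the bin count, and the half-open binning convention below are illustrative assumptions, not details taken from the survey), the metric could be computed as follows:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    confidences: predicted confidence in [0, 1] for each sample.
    correct:     truthy if the corresponding prediction was right.
    n_bins:      number of equal-width bins over [0, 1] (assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one sample is required")

    # Assumed convention: bins are [k/n_bins, (k+1)/n_bins), with 1.0 placed in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(hit)))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical toy data: four predictions at 0.9 confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A value near zero means that, within each bin, the average confidence closely matches the observed accuracy, which corresponds to a reliability diagram lying close to the perfect-calibration diagonal.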
