Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression

Wen Wu; Chao Zhang; Philip C. Woodland

TL;DR

This paper proposes a Bayesian approach, deep evidential emotion regression (DEER), that jointly estimates emotion attributes together with their aleatoric and epistemic uncertainties.

Abstract

In automatic emotion recognition (AER), labels assigned by different human annotators to the same utterance are often inconsistent due to the inherent complexity of emotion and the subjectivity of perception. Though deterministic labels generated by averaging or voting are often used as the ground truth, this practice ignores the intrinsic uncertainty revealed by the inconsistent labels. This paper proposes a Bayesian approach, deep evidential emotion regression (DEER), to estimate the uncertainty in emotion attributes. Treating the emotion attribute labels of an utterance as samples drawn from an unknown Gaussian distribution, DEER places an utterance-specific normal-inverse gamma prior over the Gaussian likelihood and predicts its hyper-parameters using a deep neural network model. This enables a joint estimation of emotion attributes along with the aleatoric and epistemic uncertainties. AER experiments on the widely used MSP-Podcast and IEMOCAP datasets showed that DEER produced state-of-the-art results for both the mean values and the distribution of emotion attributes.
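
To make the abstract's description concrete, here is a minimal sketch of how a point estimate and the two uncertainties can be read off the hyper-parameters of a normal-inverse gamma (NIG) prior. The symbol names `gamma`, `nu`, `alpha`, `beta` and the closed-form expressions follow the standard deep evidential regression formulation and are assumptions here, since this summary does not list the paper's equations. The helper `nig_uncertainties` is hypothetical, and treating the total uncertainty as the sum of the two terms is likewise an assumption (it matches the law of total variance).

```python
from dataclasses import dataclass


@dataclass
class NIGParams:
    """Hyper-parameters of a normal-inverse gamma prior (assumed naming)."""
    gamma: float  # location of the Gaussian mean
    nu: float     # virtual observation count for the mean, must be > 0
    alpha: float  # inverse-gamma shape, must be > 1 for finite variances
    beta: float   # inverse-gamma scale, must be > 0


def nig_uncertainties(p: NIGParams) -> dict:
    """Return the predicted mean and the uncertainty terms implied by p."""
    if p.nu <= 0 or p.alpha <= 1 or p.beta <= 0:
        raise ValueError("require nu > 0, alpha > 1, beta > 0")
    aleatoric = p.beta / (p.alpha - 1)            # E[sigma^2]
    epistemic = p.beta / (p.nu * (p.alpha - 1))   # Var[mu]
    return {
        "mean": p.gamma,                          # E[mu]
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,           # assumed: sum of both terms
    }


# Illustrative values only, not taken from the paper.
result = nig_uncertainties(NIGParams(gamma=0.5, nu=2.0, alpha=3.0, beta=4.0))
```

In this formulation a larger `nu` shrinks only the epistemic term, which is one way to see why the two uncertainties can be estimated jointly yet behave differently.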

Paper Structure (25 sections, 18 equations, 4 figures, 5 tables)

Introduction
Related Work
Deep Evidential Emotion Regression
Problem setup
Training
Maximising the data fit
Calibrating the uncertainty on errors
Summary and implementation details
Experimental Setup
Dataset
Model structure
Evaluation metrics
Mean prediction
Uncertainty estimation
Experiments and Results
...and 10 more sections

Figures (4)

Figure 1: Illustration of the model structure. Weights $w_1, \ldots, w_{12}$ for the weighted sum of the 12 Transformer encoder outputs are trainable and satisfy $\sum_{i=1}^{12} w_i =1$.
Figure 2: Visualisation of the (a) aleatoric, (b) epistemic, and (c) total uncertainty of dominance for MSP-Podcast. The $x$-axis is the test utterance index.
Figure 3: Reject Option of RMSE based on predicted variance for (a) MSP-Podcast and (b) IEMOCAP.
Figure 4: Model structure for bi-modal experiments.
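
The layer weighting described in the Figure 1 caption can be sketched as follows. The caption only states that the 12 weights are trainable and sum to 1; obtaining them by applying a softmax to unconstrained parameters is an assumption of this sketch, and the names `softmax`, `weighted_layer_sum`, and `logits` are hypothetical. Plain Python lists stand in for the encoder outputs.

```python
import math


def softmax(logits):
    """Map unconstrained values to positive weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layer_outputs, logits):
    """Combine per-layer feature vectors with softmax-normalised weights.

    layer_outputs: one feature vector (list of floats) per encoder layer.
    logits: one unconstrained trainable parameter per layer (assumed).
    """
    if len(layer_outputs) != len(logits):
        raise ValueError("need exactly one logit per layer output")
    weights = softmax(logits)
    dim = len(layer_outputs[0])
    return [
        sum(w * layer[d] for w, layer in zip(weights, layer_outputs))
        for d in range(dim)
    ]


# Illustrative input: 12 layers, 2-dimensional features, equal logits.
layers = [[float(i), 2.0 * i] for i in range(12)]
combined = weighted_layer_sum(layers, [0.0] * 12)
```

With equal logits every layer receives weight 1/12, so the result is the plain average of the layer outputs; training would move the logits away from this starting point.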
