Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?

Maohao Shen, J. Jon Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, Gregory W. Wornell

TL;DR

This investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity.

Abstract

This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite the perceived strong empirical performance of EDL methods on downstream tasks, a line of recent studies by Bengs et al. identifies limitations of the existing methods and concludes that their learned epistemic uncertainties are unreliable, e.g., in that they are non-vanishing even with infinite data. Building on and sharpening such analysis, we 1) provide a sharper understanding of the asymptotic behavior of a wide class of EDL methods by unifying various objective functions; 2) reveal that the EDL methods can be better interpreted as an out-of-distribution detection algorithm based on energy-based models; and 3) conduct extensive ablation studies to better assess their empirical effectiveness with real-world datasets. Through all these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity.
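
To make the notion of a "meta distribution over the predictive distribution" concrete, the sketch below shows one common way to read uncertainties off a Dirichlet with concentration parameters $\boldsymbol{\alpha}$: total uncertainty as the entropy of the mean prediction, aleatoric uncertainty as the expected entropy, and epistemic uncertainty as their difference (the mutual information). This decomposition is an assumption made for illustration and is not necessarily the exact measure used in the paper; the names `digamma` and `dirichlet_uncertainties` are hypothetical helpers.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward until the asymptotic expansion is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_uncertainties(alpha):
    """Return (total, aleatoric, epistemic) for a Dirichlet with parameters alpha.

    Illustrative decomposition (an assumption, not the paper's definition):
      total     = entropy of the mean categorical distribution
      aleatoric = expected entropy of a categorical drawn from the Dirichlet
      epistemic = total - aleatoric (mutual information)
    """
    alpha0 = sum(alpha)
    mean = [a / alpha0 for a in alpha]
    total = -sum(p * math.log(p) for p in mean if p > 0.0)
    aleatoric = -sum(
        p * (digamma(a + 1.0) - digamma(alpha0 + 1.0))
        for p, a in zip(mean, alpha)
    )
    return total, aleatoric, total - aleatoric


if __name__ == "__main__":
    # A flat Dirichlet versus a sharply concentrated one with the same mean.
    for alpha in ([1.0, 1.0, 1.0], [1000.0, 1000.0, 1000.0]):
        print(alpha, dirichlet_uncertainties(alpha))
```

Under this decomposition, the epistemic term shrinks only if the concentration parameters grow. The abstract's concern is that the learned epistemic uncertainty of EDL methods does not vanish even with infinite data.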

Paper Structure (45 sections, 4 theorems, 38 equations, 14 figures, 4 tables)

Key Result

Theorem 4.1

Let $p(\boldsymbol{\pi})=\mathsf{Dir}(\boldsymbol{\pi};\mathds{1}_C)$.
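
For reference, the prior named here is the flat Dirichlet over the $C$-class probability simplex. Using the standard Dirichlet density (with $\alpha_0$ introduced here as shorthand for the sum of the concentration parameters), setting every parameter to one gives a constant density:

$$
\mathsf{Dir}(\boldsymbol{\pi};\boldsymbol{\alpha})
= \frac{\Gamma(\alpha_0)}{\prod_{k=1}^{C}\Gamma(\alpha_k)}\prod_{k=1}^{C}\pi_k^{\alpha_k-1},
\qquad \alpha_0=\sum_{k=1}^{C}\alpha_k,
$$

$$
\mathsf{Dir}(\boldsymbol{\pi};\mathds{1}_C)
= \frac{\Gamma(C)}{\Gamma(1)^{C}}\prod_{k=1}^{C}\pi_k^{0}
= (C-1)!.
$$

That is, $p(\boldsymbol{\pi})$ is uniform over the simplex and favors no class distribution a priori.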

Figures (14)

  • Figure 1: Behavior of Uncertainties Learned by EDL methods on Real Data. (a) EDL methods learn spurious epistemic uncertainty, wherein uncertainty does not vanish with an increasing number of observed samples, contrary to the fundamental definition of epistemic uncertainty. (b) Instead of a constant, EDL methods learn model-dependent aleatoric uncertainty that depends on hyper-parameter $\lambda$, contrary to the fundamental definition of aleatoric uncertainty. Similar behavior holds for 2D Gaussian data (see Figure \ref{fig:inconsistency_gauss} in Appendix \ref{app:subsec:Gaussian}).
  • Figure 2: OOD Detection Performance vs. Hyper-parameter $\lambda$ on CIFAR10. The $x$-axis represents the increasing $\lambda$ value, and the $y$-axis represents the average AUROC score of OOD detection tasks. The uncertainty quantification performance of EDL methods is sensitive to the hyper-parameter $\lambda$ and generally benefits from small $\lambda$.
  • Figure 3: Comparison of Different EDL Methods on OOD Detection. Distillation-based methods, including the newly proposed Bootstrap-Distill method, demonstrate a clear advantage over other classical EDL methods. Similar behavior holds for the selective classification task.
  • Figure 4: Comparison of Different EDL Methods on Selective Classification. Distillation-based methods, including the newly proposed Bootstrap-Distill method, demonstrate a clear advantage over other classical EDL methods.
  • Figure 5: Behavior of Uncertainties Learned by EDL methods on Toy Data. (a) EDL methods learn spurious epistemic uncertainty, wherein uncertainty does not vanish with an increasing number of observed samples, contrary to the fundamental definition of epistemic uncertainty. (b) Instead of a constant, EDL methods learn model-dependent aleatoric uncertainty that depends on hyper-parameter $\lambda$, contrary to the fundamental definition of aleatoric uncertainty.
  • ...and 9 more figures
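
Several of the captions above report OOD detection quality as an AUROC score. As a minimal sketch of what that metric measures, the function below computes AUROC directly from uncertainty scores: the probability that a randomly chosen OOD input receives a higher score than a randomly chosen in-distribution input, with ties counted as one half. The function name `auroc` and the choice of uncertainty score are assumptions for illustration; the paper's evaluation pipeline may differ.

```python
def auroc(id_scores, ood_scores):
    """AUROC for OOD detection from per-input uncertainty scores.

    id_scores:  scores on in-distribution inputs (should be low)
    ood_scores: scores on out-of-distribution inputs (should be high)
    Uses the pairwise (Mann-Whitney) form; ties count as half a win.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s_ood in ood_scores:
        for s_id in id_scores:
            if s_ood > s_id:
                wins += 1.0
            elif s_ood == s_id:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


if __name__ == "__main__":
    # Hypothetical scores: three of the four (OOD, ID) pairs are ranked correctly.
    print(auroc([0.1, 0.4], [0.3, 0.9]))
```

Because AUROC depends only on the ranking of scores, a method can separate OOD inputs well without its scores being faithful estimates of epistemic uncertainty, which is consistent with the paper's reading of EDL as an OOD detector.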

Theorems & Definitions (7)

  • Theorem 4.1: Unifying EDL Objectives for Classification
  • Theorem 5.1
  • Example 5.2: Categorical likelihood
  • Lemma D.1
  • proof
  • Theorem E.1
  • proof