Uncertainty Regularized Evidential Regression

Kai Ye, Tiejin Chen, Hua Wei, Liang Zhan

TL;DR

The paper identifies a fundamental weakness in Evidential Regression Networks (ERN): the activation functions that enforce non-negativity can create High Uncertainty Areas (HUA) where gradients vanish and learning stalls. It introduces an uncertainty-regularization term $\mathcal{L}^{\mathrm{U}}$ that preserves nonzero gradients within HUA, enabling ERN (and MERN) to learn from the full training set. The approach is validated on cubic regression and monocular depth estimation, where it improves uncertainty estimation and calibration, and occasionally predictive accuracy, including in out-of-distribution scenarios. The authors extend the regularization to multivariate evidential models with normal-inverse-Wishart (NIW) priors, offering a general mechanism to mitigate zero-gradient issues across evidential regression variants. Overall, the work provides a regularization that enables learning in regions where standard ERN training stalls and improves practical performance on real-world tasks.

Abstract

The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer theory to predict a target and quantify the associated uncertainty. Guided by the underlying theory, specific activation functions must be employed to enforce non-negative values, a constraint that compromises model performance by limiting its ability to learn from all samples. This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it. First, we define the region where the models cannot effectively learn from the samples. We then thoroughly analyze the ERN and investigate this constraint. Leveraging the insights from our analysis, we address the limitation by introducing a novel regularization term that enables the ERN to learn from the whole training set. Our extensive experiments substantiate our theoretical findings and demonstrate the effectiveness of the proposed solution.

Paper Structure

This paper contains 38 sections, 4 theorems, 48 equations, 9 figures, 1 table.

Key Result

Theorem 1

ERN cannot learn from samples in the High Uncertainty Area (HUA).
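
The mechanism behind this result can be illustrated with a minimal Python sketch. It assumes, for illustration only, that a constrained distribution parameter is computed as `activation(o)` from a raw network output `o`, and it treats a very negative `o` as a stand-in for the high uncertainty area; the paper's formal definition and proof are not reproduced here. All helper names are hypothetical.

```python
import math


def softplus_grad(o):
    """Derivative of softplus(o) = log(1 + exp(o)), which is sigmoid(o)."""
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)  # stable branch for negative inputs
    return e / (1.0 + e)


def relu_grad(o):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if o > 0 else 0.0


def grad_wrt_raw_output(upstream_grad, o, activation_grad):
    """Chain rule: dLoss/do = dLoss/d(activation) * d(activation)/do."""
    return upstream_grad * activation_grad(o)


# Assumption: a large upstream gradient, as a badly fit sample might produce.
upstream = 1000.0
for o in (2.0, 0.0, -5.0, -20.0, -50.0):
    g_soft = grad_wrt_raw_output(upstream, o, softplus_grad)
    g_relu = grad_wrt_raw_output(upstream, o, relu_grad)
    print(f"o={o:6.1f}  softplus grad={g_soft:.3e}  relu grad={g_relu:.3e}")
```

Under these assumptions, even a large upstream loss gradient is multiplied by an activation derivative that is exactly zero (ReLU) or vanishingly small (Softplus), so the raw output barely moves once it is far into the negative range.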

Figures (9)

  • Figure 1: An overview of the Evidential Regression Network (ERN) architecture, with illustrations of the final predictive distributions. ERN outputs four predictions as distribution parameters, using activation functions such as ReLU or Softplus to constrain the outputs to the valid range of each parameter.
  • Figure 2: The ERN loss $\mathcal{L}^{\mathrm{ERN}}$ cannot help the model get out of the high uncertainty area, while our proposed $\mathcal{L}^{\mathrm{U}}$ can still learn from samples in the grey area.
  • Figure 3: Uncertainty estimation on Cubic Regression. The blue shade represents prediction uncertainty. An effective evidential model would cause the blue shade to cover the distance between the predicted value and the ground truth precisely. Top: Comparison of model performance within HUA. Bottom: Comparison of model performance outside HUA. UR-ERN covers the ground truth precisely both within and outside HUA.
  • Figure 4: Uncertainty prediction of Depth Estimation within HUA. (a) The blue shade represents prediction uncertainty. A good estimation of uncertainty should cover the gap between prediction and ground truth exactly. (b) Root Mean Square Error (RMSE) at various confidence levels. The evidential model with a larger confidence level should have a lower RMSE. (c) Uncertainty calibration calculated following Kuleshov et al. (2018); the ideal curve is $y=x$. The calibration errors are 0.2261, 0.2250, and 0.0243 for ERN, NLL-ERN and UR-ERN, respectively.
  • Figure 5: Uncertainty prediction of Depth Estimation outside HUA. (a) RMSE at various confidence levels. (b) Uncertainty calibration (ideal: $y=x$). The calibration errors are 0.1366, 0.1978, and 0.0289 for ERN, NLL-ERN and UR-ERN, respectively. (c) and (d) show OOD experimental results. (c) Entropy comparisons for different methods. (d) Density histograms of entropy. Entropy is calculated from $\sigma$, directly related to uncertainty. A good evidential model should be able to distinguish OOD data.
  • ...and 4 more figures
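
The calibration curves in Figures 4(c) and 5(b) compare expected and observed confidence levels. A minimal sketch of that idea follows, assuming Gaussian predictive distributions and an unweighted sum of squared gaps as the error; the exact weighting behind the numbers reported in the captions is not stated here, and all function names and the synthetic data are hypothetical.

```python
import math
import random


def gaussian_cdf(y, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def calibration_curve(y_true, mu, sigma, levels):
    """For each expected level p, the fraction of targets whose
    predicted CDF value is at most p (ideal curve: y = x)."""
    cdf_values = [gaussian_cdf(y, m, s) for y, m, s in zip(y_true, mu, sigma)]
    n = len(cdf_values)
    return [sum(1 for v in cdf_values if v <= p) / n for p in levels]


def calibration_error(levels, observed):
    """Assumed form: unweighted sum of squared gaps to the ideal curve."""
    return sum((p - q) ** 2 for p, q in zip(levels, observed))


# Synthetic check: targets drawn from N(0, 1), predicted mean fixed at 0.
rng = random.Random(0)
y_true = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mu = [0.0] * len(y_true)
levels = [i / 10 for i in range(1, 10)]

# Predicted sigma matches the data (well calibrated) vs. far too small.
obs_good = calibration_curve(y_true, mu, [1.0] * len(y_true), levels)
obs_bad = calibration_curve(y_true, mu, [0.2] * len(y_true), levels)
err_good = calibration_error(levels, obs_good)
err_bad = calibration_error(levels, obs_bad)
print("well calibrated:", err_good)
print("overconfident:  ", err_bad)
```

In this toy setup the overconfident predictor's curve departs from $y=x$ and its error is larger, which is the kind of gap the calibration errors in the captions quantify.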

Theorems & Definitions (10)

  • Definition 1: High Uncertainty Area
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • proof