Density-Regression: Efficient and Distance-Aware Deep Regressor for Uncertainty Estimation under Distribution Shifts

Ha Manh Bui; Anqi Liu

TL;DR

We propose Density-Regression, a method that leverages the density function for uncertainty estimation and achieves fast inference with a single forward pass. It is distance-aware on the feature space, which is a necessary condition for a neural network to produce high-quality uncertainty estimates under distribution shifts.

Abstract

Modern deep ensemble techniques achieve strong uncertainty estimation performance by running multiple forward passes with different models. This comes at the price of high storage requirements and slow inference at test time. To address this issue, we propose Density-Regression, a method that leverages the density function in uncertainty estimation and achieves fast inference with a single forward pass. We prove that it is distance-aware on the feature space, which is a necessary condition for a neural network to produce high-quality uncertainty estimates under distribution shifts. Empirically, we conduct experiments on regression tasks with the cubic toy dataset, the UCI benchmark, time-series weather forecasting, and depth estimation in real-world shifted applications. We show that Density-Regression achieves uncertainty estimation performance under distribution shifts that is competitive with modern deep regressors, while using a smaller model and running inference faster.

Paper Structure (31 sections, 6 theorems, 46 equations, 11 figures, 15 tables, 1 algorithm)

Introduction
Background
Preliminaries
Evaluating Uncertainty
Test-time Efficiency
Density-Regression
Exponential Family Distribution
Training and Inference Process
Theoretical Analysis
Experiments
Toy Dataset
Time Series Weather Forecasting
Benchmark UCI
Monocular Depth Estimation
Test-time Efficiency Evaluation
...and 16 more sections

Key Result

Theorem 3.1

If the predictive distribution follows Eq. eq:exp and the sufficient statistic has the form $\Phi(z,y) = ,$ then Density-Regression has the following conditional Gaussian distribution: $p(y|x;\theta) \sim \mathcal{N}(\mu(x,\theta),\sigma^2(x,\theta)),$ where $z=f(x)$, and $\theta_{g}^\mu$ and $\theta_{g}^\sigma$ are the parameters (model weights) of the regressor $g$, i.e., $(\theta_{g}^\mu,
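
For orientation, the textbook exponential-family form of a univariate Gaussian is sketched below. This is a standard identity and not necessarily the paper's exact parameterization: the symbols $T(y)$, $\eta_1$, $\eta_2$, and $A(\eta)$ are introduced here for illustration only and do not appear in the statement above.

```latex
\begin{align}
\mathcal{N}(y \mid \mu, \sigma^2)
  &= \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) \\
  &= \exp\!\left(\eta_1 y + \eta_2 y^2 - A(\eta)\right), \\
T(y) &= (y,\; y^2), \qquad
\eta_1 = \frac{\mu}{\sigma^2}, \qquad
\eta_2 = -\frac{1}{2\sigma^2}, \\
A(\eta) &= -\frac{\eta_1^2}{4\eta_2}
           + \frac{1}{2}\log\!\left(-\frac{\pi}{\eta_2}\right), \\
\mu &= -\frac{\eta_1}{2\eta_2}, \qquad
\sigma^2 = -\frac{1}{2\eta_2}.
\end{align}
```

The last line shows how a mean and variance can be read off from the two natural parameters, which is the general mechanism by which an exponential-family predictive distribution reduces to a Gaussian.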

Figures (11)

Figure 1: Predictive distributions for the toy dataset $y=x^3 + \epsilon$, $\epsilon \sim \mathcal{N}(0,3^2)$. The gray dots in the area between the two vertical dashed lines represent training observations, the red dashed line represents the true data-generating function, and the blue line represents the mean predictions, with the blue shaded areas corresponding to $\pm3$ standard deviations around the mean. Our Density-Regression achieves distance awareness and can therefore improve distribution calibration, giving confident and sharp predictions on IID training data and decreasing certainty and sharpness when the OOD data is far from the training set. A quick demo is available at https://colab.research.google.com/drive/1p5gK-rOI4XYgg2zTVtbh-5Ky06PGlA09?usp=sharing.
Figure 2: The overall architecture of Density-Regression, including encoder $f$, regressor $g$, and density function $p(Z;\alpha)$. Solid rectangle boxes represent these functions. Dashed rectangle boxes represent function weights. The three training steps and the inference process follow Alg. \ref{alg:algorithm}.
Figure 3: Comparison between Deep Ensembles and our model regarding temperature in Celsius (normalized) for every hour on the same day. More details are in Apd. \ref{apd:time_series}.
Figure 4: Comparison of pixel-wise depth predictions and predictive uncertainty on (a) IID data and (b) the corrupted OOD dataset at noise level $0.04$. Detailed figures for the robustness under corrupted noise are in Apd. \ref{apd:depth}; (c) Our model's performance on the real-world OOD ApolloScape.
Figure 5: (a) Visualization of calibration error with a reliability diagram on the real-world OOD ApolloScape; (b) Comparison of model storage requirements at test time; (c) Inference cost comparison at test time across three modern GPU architectures (detailed in Apd. \ref{apd:implementation}).
...and 6 more figures
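
The toy dataset in the Figure 1 caption, $y=x^3+\epsilon$ with $\epsilon \sim \mathcal{N}(0,3^2)$, can be generated with a few lines of NumPy. The following is a minimal sketch, not the authors' code: the helper name `make_cubic_toy`, the sample sizes, and the input ranges (training on $[-4, 4]$, evaluating on the wider $[-6, 6]$) are assumptions chosen for illustration, since the caption does not state where the vertical dashed lines sit.

```python
import numpy as np


def make_cubic_toy(n, low, high, noise_std=3.0, seed=0):
    """Sample (x, y) pairs from y = x^3 + eps, eps ~ N(0, noise_std^2).

    Hypothetical helper: the name, ranges, and defaults are illustrative
    assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(low, high, size=n)        # inputs drawn uniformly in [low, high)
    eps = rng.normal(0.0, noise_std, size=n)  # Gaussian observation noise
    y = x ** 3 + eps
    return x, y


# Assumed split: a narrow training range and a wider evaluation range,
# so that part of the evaluation inputs lie outside the training support.
x_train, y_train = make_cubic_toy(1000, -4.0, 4.0, seed=0)
x_test, y_test = make_cubic_toy(1000, -6.0, 6.0, seed=1)
```

Evaluation inputs that fall outside the assumed training range play the role of the OOD region in the figure, where a distance-aware model would be expected to widen its predictive interval.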

Theorems & Definitions (15)

Definition 2.1
Definition 2.2
Theorem 3.1
Corollary 3.2
Remark 3.3
Lemma 3.4
Theorem 3.5
Theorem 3.6
Remark 3.7
Lemma A.1
...and 5 more
