Uncertainty Voting Ensemble for Imbalanced Deep Regression

Yuchang Jiang, Vivien Sainte Fare Garnot, Konrad Schindler, Jan Dirk Wegner

TL;DR

This work replaces traditional regression losses with a negative log-likelihood loss, under which the model also predicts sample-wise aleatoric uncertainty. The predicted uncertainty values are then used to fuse the predictions of the different expert models in the ensemble, eliminating the need for a separate aggregation module.

Abstract

Data imbalance is ubiquitous when applying machine learning to real-world problems, particularly regression problems. If training data are imbalanced, learning is dominated by the densely covered regions of the target distribution, and the learned regressor tends to perform poorly in sparsely covered regions. Beyond standard measures like oversampling or reweighting, there are two main approaches to learning from imbalanced data. For regression, recent work leverages the continuity of the distribution, while for classification, the trend has been to use ensemble methods, allowing some members to specialize in predictions for sparser regions. In our method, named UVOTE, we integrate recent advances in probabilistic deep learning with an ensemble approach for imbalanced regression. We replace traditional regression losses with a negative log-likelihood loss, so that the model also predicts sample-wise aleatoric uncertainty. Our experiments show that this loss function handles imbalance better than traditional regression losses. Additionally, we use the predicted aleatoric uncertainty values to fuse the predictions of different expert models in the ensemble, eliminating the need for a separate aggregation module. We compare our method with existing alternatives on multiple public benchmarks and show that UVOTE consistently outperforms the prior art, while at the same time producing better-calibrated uncertainty estimates. Our code is available at https://github.com/SherryJYC/UVOTE.
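
To make the idea of a negative log-likelihood regression loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian likelihood and a network that outputs a log-variance alongside the regression value; the abstract does not state which distribution is used, and the names `gaussian_nll`, `mean_nll`, and `log_var` are hypothetical.

```python
import math


def gaussian_nll(y_true, y_pred, log_var):
    """Per-sample negative log-likelihood under an assumed Gaussian.

    log_var is the predicted log-variance, log(sigma^2). Predicting the
    logarithm keeps the variance positive (a choice made for this sketch).
    """
    squared_error = (y_true - y_pred) ** 2
    return 0.5 * (log_var + squared_error / math.exp(log_var)) + 0.5 * math.log(2 * math.pi)


def mean_nll(targets, preds, log_vars):
    """Average the per-sample loss over a batch of plain Python floats."""
    losses = [gaussian_nll(y, p, s) for y, p, s in zip(targets, preds, log_vars)]
    return sum(losses) / len(losses)
```

Under this assumption, a sample with a large error is penalized less when the model also reports a larger variance for it, while the `log_var` term discourages reporting a large variance everywhere. The predicted variance then serves as the sample-wise aleatoric uncertainty.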

Paper Structure (41 sections, 8 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: Overview of UVOTE. A shared backbone encodes the input $x$ into a representation $z$. A mixture of $M$ different experts uses this shared representation to make their predictions. Each expert predicts a regression value $\hat{y}$ as well as the uncertainty $\hat{s}$ of that prediction. At inference time, we use the prediction of the most certain expert $m_0$.
  • Figure 2: Dataset overview. Distribution of the training set for each of the four datasets. The tasks are very different, ranging from age regression to text similarity prediction and wind speed estimation.
  • Figure 3: Per-expert and aggregated MAE on IMDB-WIKI. Larger coverage of the triangle indicates better performance. The uncertainty-based aggregation of UVOTE nearly matches the performance of the best expert on each subset of the test data.
  • Figure 4: Distribution of expert selection on AgeDB. Notably, Expert 1, which specializes in minority samples, is predominantly chosen for labels in the few-shot regions.
  • Figure 5: Predictions of our model vs. the vanilla model on AgeDB. For clarity, we plot the average prediction of each bin.
  • ...and 3 more figures
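
The inference rule described in the Figure 1 caption (use the prediction of the most certain expert $m_0$) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: each expert is assumed to return a pair $(\hat{y}, \hat{s})$ for a single input, and the function name `uncertainty_vote` is hypothetical.

```python
def uncertainty_vote(expert_outputs):
    """Select the prediction of the most certain expert for one sample.

    expert_outputs: list of (y_hat, s_hat) pairs, one per expert, where
    s_hat is that expert's predicted uncertainty (smaller = more certain).
    Returns (m0, y_hat), with m0 the index of the selected expert.
    """
    m0 = min(range(len(expert_outputs)), key=lambda m: expert_outputs[m][1])
    return m0, expert_outputs[m0][0]


# Three hypothetical experts predicting an age for the same input.
outputs = [(30.0, 1.2), (42.0, 0.3), (35.0, 0.9)]
selected_expert, prediction = uncertainty_vote(outputs)
```

Because the selection depends only on the predicted uncertainties, no additional aggregation module has to be trained, which is the property the TL;DR highlights.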