Bayesian Active Learning By Distribution Disagreement

Thorben Werner, Lars Schmidt-Thieme

TL;DR

The paper tackles active learning for regression with uncertainty quantification, focusing on normalizing flows that yield a full predictive distribution $p(y|x)$. It introduces BALSA, a suite of BALD-based methods that operate directly on predictive distributions via distribution disagreement, including Grid and Pair sampling with KL-Divergence and Earth Mover's Distance metrics. An extensive evaluation across four real-world datasets and two architectures shows BALSA variants, particularly BALSA$^{KL}$ with Pair sampling, achieving state-of-the-art performance and outperforming standard heuristics and clustering baselines. The work provides a practical, reproducible framework for uncertainty-aware AL in regression with NF models and clarifies how to separate epistemic from aleatoric uncertainty in pool-based settings.

Abstract

Active Learning (AL) for regression has been systematically under-researched due to the increased difficulty of measuring uncertainty in regression models. Since normalizing flows offer a full predictive distribution instead of a point forecast, they facilitate direct usage of known heuristics for AL like Entropy or Least-Confident sampling. However, we show that most of these heuristics do not work well for normalizing flows in pool-based AL, and that more sophisticated algorithms are needed to distinguish between aleatoric and epistemic uncertainty. In this work we propose BALSA, an adaptation of the BALD algorithm tailored for regression with normalizing flows. With this work we extend current research on uncertainty quantification with normalizing flows \cite{berry2023normalizing, berry2023escaping} to real-world data and pool-based AL with multiple acquisition functions and query sizes. We report SOTA results for BALSA across 4 different datasets and 2 different architectures.
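
To make the idea of scoring a pool point by the disagreement between predictive distributions more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that K stochastic forward passes each yield a Gaussian predictive density for one pool point (a stand-in for a normalizing-flow density), evaluates those densities on a fixed grid, and averages the KL divergence from each density to their mean. That average equals the BALD-style mutual information for the mixture of the K densities. The function names, the Gaussian assumption, and the grid bounds are all hypothetical choices made for illustration; the exact BALSA formulations may differ.

```python
import numpy as np


def gaussian_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y (broadcasts over arrays)."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def kl_grid_disagreement(mus, sigmas, grid):
    """Mean KL(p_k || p_mean) over K sampled predictive densities on a grid.

    mus, sigmas: arrays of shape (K,), one Gaussian per stochastic forward pass
                 (an illustrative assumption, not the paper's model).
    grid:        equally spaced 1-D array of target values.
    """
    dy = grid[1] - grid[0]
    dens = gaussian_pdf(grid[None, :], mus[:, None], sigmas[:, None])  # (K, G)
    # Renormalize so each density integrates to 1 on the truncated grid.
    dens = dens / (dens.sum(axis=1, keepdims=True) * dy)
    mean_dens = dens.mean(axis=0)
    eps = 1e-12  # avoids log(0) in the tails
    kl = np.sum(dens * (np.log(dens + eps) - np.log(mean_dens + eps)), axis=1) * dy
    return float(kl.mean())


grid = np.linspace(-10.0, 10.0, 2001)

# Passes agree on a narrow density: low noise, no disagreement.
agree_narrow = kl_grid_disagreement(np.zeros(3), np.full(3, 0.5), grid)
# Passes agree on a wide density: high aleatoric noise, still no disagreement.
agree_wide = kl_grid_disagreement(np.zeros(3), np.full(3, 3.0), grid)
# Passes disagree about the location: high epistemic uncertainty.
disagree = kl_grid_disagreement(np.array([-2.0, 0.0, 2.0]), np.full(3, 0.5), grid)
```

In this toy setup the score stays near zero whenever the sampled densities coincide, regardless of how wide they are, and grows only when the samples disagree. This is the property that separates a disagreement score from a plain entropy heuristic, which would also rank the wide-but-agreeing case highly.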

Paper Structure (27 sections, 14 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Overview of our regression models. Both models use an MLP encoder to create a latent embedding $z$ of the input, before using $z$ to parametrize a predictive distribution.
  • Figure 2: Critical Difference Diagram for all datasets and query size 1 (lower is better). Horizontal bars indicate statistical significance according to the Wilcoxon-Holm test.
  • Figure 3: AL trajectories of all tested algorithms in the Diamonds dataset. Curves based on NLL (left) and MAE (right); lower is better. Trajectories are averaged over 30 restarts of each experiment.
  • Figure 4: Critical Difference Diagrams with ranks computed based on MAE instead of NLL. Same experimental parameters as Fig. \ref{fig:cd_diagrams}.
  • Figure 5: Comparison of "dual" evaluation mode for both BALSA algorithms as well as the re-normalized version of BALSA$^{\text{KL Grid}}$. Based on NLL and $\tau = 1$.
  • ...and 2 more figures