Reducing Aleatoric and Epistemic Uncertainty through Multi-modal Data Acquisition

Arthur Hoarau, Benjamin Quost, Sébastien Destercke, Willem Waegeman

TL;DR

The paper tackles uncertainty quantification in multi-modal AI by challenging the view that aleatoric uncertainty is irreducible. It introduces ALFA, a two-directional data acquisition protocol that reduces epistemic uncertainty through acquiring more labeled data and reduces aleatoric uncertainty by adding informative modalities, guided by thresholds on $EU$ and $AU$. The approach is instantiated with four uncertainty-quantification methods, including Deep EK-NN, and validated on Wine and BIOSCAN-5M datasets, demonstrating cost-efficient, reliable predictions and actionable decisions about when to collect more data or switch modalities. The work highlights the dataset-dependent nature of uncertainty disentanglement, supports monotonic behavior of $EU$ under information growth, and provides open-source tooling to facilitate adoption in uncertainty-aware, multi-modal settings.
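The threshold-guided choice between the two acquisition directions can be sketched as a small decision rule. This is only an illustration under stated assumptions, not the ALFA algorithm itself: the function name `next_action`, the `Action` labels, the order in which $EU$ and $AU$ are checked, and the abstain fallback when no modality is left are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical action labels, introduced only for this sketch.
    PREDICT = "predict"
    ACQUIRE_LABELS = "acquire more labeled data"
    ACQUIRE_MODALITY = "acquire another modality"
    ABSTAIN = "abstain"


def next_action(eu, au, eu_threshold, au_threshold, modalities_left):
    """Pick an acquisition direction from two uncertainty estimates.

    Assumption: EU is checked before AU. The summary does not state
    the order used by the authors.
    """
    # High epistemic uncertainty: more labeled observations are expected to help.
    if eu > eu_threshold:
        return Action.ACQUIRE_LABELS
    # High aleatoric uncertainty: an additional modality may help, if one remains.
    if au > au_threshold:
        return Action.ACQUIRE_MODALITY if modalities_left > 0 else Action.ABSTAIN
    # Both uncertainties are below their thresholds: the prediction is accepted.
    return Action.PREDICT
```

For instance, `next_action(0.1, 0.6, eu_threshold=0.2, au_threshold=0.3, modalities_left=2)` selects `Action.ACQUIRE_MODALITY`, since only the aleatoric estimate exceeds its threshold.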

Abstract

To generate accurate and reliable predictions, modern AI systems need to combine data from multiple modalities, such as text, images, audio, spreadsheets, and time series. Multi-modal data introduces new opportunities and challenges for disentangling uncertainty: it is commonly assumed in the machine learning community that epistemic uncertainty can be reduced by collecting more data, while aleatoric uncertainty is irreducible. However, this assumption is challenged in modern AI systems when information is obtained from different modalities. This paper introduces an innovative data acquisition framework where uncertainty disentanglement leads to actionable decisions, allowing sampling in two directions: sample size and data modality. The main hypothesis is that aleatoric uncertainty decreases as the number of modalities increases, while epistemic uncertainty decreases by collecting more observations. We provide proof-of-concept implementations on two multi-modal datasets to showcase our data acquisition framework, which combines ideas from active learning, active feature acquisition and uncertainty quantification.
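The hypothesis that additional modalities reduce aleatoric uncertainty can be illustrated on a toy distribution. The sketch below assumes that aleatoric uncertainty is read as the conditional entropy of the label given the observed modalities, which is one common interpretation and not necessarily the measure used in the paper. The XOR data and the helper `conditional_entropy` are hypothetical.

```python
import math
from collections import defaultdict


def conditional_entropy(joint, observed):
    """Return H(Y | X_observed) in bits for a joint table {(x1, x2, y): prob}.

    `observed` lists the indices of the modalities that are available.
    """
    p_xy = defaultdict(float)  # marginal over (observed modalities, label)
    p_x = defaultdict(float)   # marginal over observed modalities only
    for (*xs, y), p in joint.items():
        key = tuple(xs[i] for i in observed)
        p_xy[(key, y)] += p
        p_x[key] += p
    return -sum(
        p * math.log2(p / p_x[key]) for (key, _y), p in p_xy.items() if p > 0
    )


# Toy data (assumption): the label is the XOR of two uniform binary modalities.
joint = {(x1, x2, x1 ^ x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}

h_one = conditional_entropy(joint, observed=[0])      # first modality only
h_both = conditional_entropy(joint, observed=[0, 1])  # both modalities
```

With one modality the label remains a fair coin flip (one bit of entropy), whereas with both modalities it is fully determined. XOR is an extreme case chosen for clarity. In general, conditioning on more variables never increases the entropy on average, although it can do so for individual instances.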

Paper Structure

This paper contains 27 sections, 13 equations, 15 figures, 5 tables, 1 algorithm.

Figures (15)

  • Figure 1: Active learning with multi-modal feature acquisition illustrated on two test instances from the BIOSCAN-5M dataset. For the first insect, all three modalities are needed to decrease the aleatoric uncertainty to a satisfactory level, whereas for the second insect, the prediction is sufficiently certain after adding geographical information to the image modality.
  • Figure 2: Experiments on Wine and BIOSCAN-5M with uncertainty estimation through four different methods.
  • Figure 3: Global test performance (top) and number of robust predictions with test performance for robust predictions (bottom) for each model vs. size of the training set on Wine dataset with Deep EK-NN.
  • Figure 4: Global test performance (top) and robust predictions with performance on robust predictions (bottom) for each model vs. size of the training set on BIOSCAN-5M dataset with Deep EK-NN.
  • Figure 5: Aleatoric uncertainty vs. Epistemic uncertainty on CIFAR-10 with ResNet18 exhibiting a positive correlation ($\simeq 0.89$). This is a perfectly reasonable phenomenon associated with image datasets and not an issue of disentanglement. Ships are well-represented (low EU) and easy to classify (low AU), whereas cats that resemble deer (or foxes) are rare (high EU) and also difficult to distinguish (high AU).
  • ...and 10 more figures

Theorems & Definitions (2)

  • Example A.1
  • Example A.2