Torch-Uncertainty: A Deep Learning Framework for Uncertainty Quantification

Adrien Lafage, Olivier Laurent, Firas Gabetni, Gianni Franchi

TL;DR

Torch-Uncertainty introduces a unified, domain-general framework for uncertainty quantification in deep learning, built on PyTorch and Lightning. It emphasizes modular UQ method integration, extensive evaluation metrics, and a broad, plug-and-play dataset suite to enable reproducible cross-domain benchmarking. The paper showcases classification and segmentation benchmarks, demonstrates state-of-the-art performance for ensembles and calibration techniques, and highlights the framework’s potential to accelerate robust, uncertainty-aware AI deployment. By providing pretrained models, tutorials, and open-source tooling, it aims to democratize access to principled UQ research and practice across academia and industry.

Abstract

Deep Neural Networks (DNNs) have demonstrated remarkable performance across various domains, including computer vision and natural language processing. However, they often struggle to accurately quantify the uncertainty of their predictions, limiting their broader adoption in critical real-world applications. Uncertainty Quantification (UQ) for Deep Learning seeks to address this challenge by providing methods to improve the reliability of uncertainty estimates. Although numerous techniques have been proposed, a unified tool offering a seamless workflow to evaluate and integrate these methods remains lacking. To bridge this gap, we introduce Torch-Uncertainty, a PyTorch and Lightning-based framework designed to streamline DNN training and evaluation with UQ techniques and metrics. In this paper, we outline the foundational principles of our library and present comprehensive experimental results that benchmark a diverse set of UQ methods across classification, segmentation, and regression tasks. Our library is available at https://github.com/ENSTA-U2IS-AI/Torch-Uncertainty

Paper Structure

This paper contains 46 sections, 1 equation, 4 figures, 11 tables.

Figures (4)

  • Figure 1: A suggested overview of the many dimensions of robustness and uncertainty quantification in deep learning. In Torch-Uncertainty, we focus on "rational" in-distribution predictions, robustness to distribution shift, and the capacity to detect out-of-distribution samples.
  • Figure 2: Overview of Torch-Uncertainty's usage for model training and evaluation. Post-hoc methods are optional but can improve performance when practitioners can access enough data. UQ and TU stand for uncertainty quantification and Torch-Uncertainty, respectively.
  • Figure 3: Best checkpoint positions according to validation metrics. The model is a UNet optimized on MUAD's semantic segmentation dataset.
  • Figure 4: Example of a prediction visualization available in Torch-Uncertainty. The model is a DeepLabV3+ trained on MUAD-Small for 20 epochs.