Test-Time Mixup Augmentation for Data and Class-Specific Uncertainty Estimation in Deep Learning Image Classification

Hansang Lee; Haeil Lee; Helen Hong; Junmo Kim

TL;DR

This work addresses uncertainty estimation in deep learning image classification by introducing test-time mixup augmentation (TTMA). It defines two measures: TTMA-DU, an entropy-based data uncertainty computed from predictions on mixup-perturbed test inputs and intended to better separate correct from incorrect predictions; and TTMA-CSU, a class-specific uncertainty that reveals class confusion and class similarity in the latent space. In experiments on ISIC-18 and CIFAR-100, TTMA-DU consistently distinguishes correct from incorrect predictions better than conventional test-time augmentation (TTA) and Monte Carlo dropout (MCDO), while TTMA-CSU provides information about class relationships beyond what average feature distance (AFD) captures. The approach offers a practical, interpretable framework for more trustworthy predictions and can inform model improvement. Its main limitation is computational cost, and selective sampling strategies are identified as future work.

Abstract

Uncertainty estimation of trained deep learning networks is valuable for optimizing learning efficiency and evaluating the reliability of network predictions. In this paper, we propose a method for estimating uncertainty in deep learning image classification using test-time mixup augmentation (TTMA). To improve the ability to distinguish correct and incorrect predictions in existing aleatoric uncertainty, we introduce TTMA data uncertainty (TTMA-DU) by applying mixup augmentation to test data and measuring the entropy of the predicted label histogram. In addition to TTMA-DU, we propose TTMA class-specific uncertainty (TTMA-CSU), which captures aleatoric uncertainty specific to individual classes and provides insight into class confusion and class similarity within the trained network. We validate our proposed methods on the ISIC-18 skin lesion diagnosis dataset and the CIFAR-100 real-world image classification dataset. Our experiments show that (1) TTMA-DU more effectively differentiates correct and incorrect predictions compared to existing uncertainty measures due to mixup perturbation, and (2) TTMA-CSU provides information on class confusion and class similarity for both datasets.
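The TTMA-DU idea described in the abstract (mix a test sample with other samples, predict a label for each mixed input, and take the entropy of the resulting label histogram) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ttma_du_sketch`, `partners`, `predict`, and `toy_predict`, the fixed mixing weight `lam`, and the choice of mixing partners are all assumptions made for illustration, and the toy one-dimensional classifier stands in for a trained network.

```python
import math
from collections import Counter


def histogram_entropy(labels):
    """Shannon entropy (in nats) of the empirical histogram of labels."""
    total = len(labels)
    probs = [count / total for count in Counter(labels).values()]
    return sum(p * math.log(1.0 / p) for p in probs)


def mixup(x, partner, lam):
    """Convex combination of two equally sized feature vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x, partner)]


def ttma_du_sketch(x, partners, predict, lam=0.7):
    """Entropy of the labels predicted for mixup-perturbed copies of x.

    Assumptions (not taken from the paper): `partners` is a list of
    samples to mix with, `predict` returns a hard label, and `lam` is a
    fixed weight on the test sample.
    """
    labels = [predict(mixup(x, p, lam)) for p in partners]
    return histogram_entropy(labels)


def toy_predict(v):
    """Hypothetical 1-D classifier with a decision boundary at 0.5."""
    return 0 if v[0] < 0.5 else 1


partners = [[0.0], [1.0], [0.2], [0.9]]
confident = ttma_du_sketch([0.05], partners, toy_predict)   # far from boundary
borderline = ttma_du_sketch([0.45], partners, toy_predict)  # near the boundary
print(confident, borderline)
```

In this toy setting, the sample far from the decision boundary keeps the same predicted label under every mix, so its histogram entropy is zero, while the sample near the boundary flips labels under perturbation and receives a higher value. That is the qualitative behavior the abstract attributes to mixup perturbation.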

Paper Structure (24 sections, 13 equations, 14 figures, 4 tables)

Introduction
Methods
TTMA data uncertainty
Test data augmentation with mixup
Mixup label prediction and test label inference
TTMA data uncertainty estimation
TTMA class-specific uncertainty
Test data augmentation with mixup
Mixup label prediction and test label inference
TTMA class-specific uncertainty estimation
Interpretation of TTMA class-specific uncertainty
Experiments
Datasets
Implementation details
Experiments on TTMA Data Uncertainty
...and 9 more sections

Figures (14)

Figure 1: A process of the proposed TTMA data uncertainty (TTMA-DU) estimation method.
Figure 2: A process of the proposed TTMA class-specific uncertainty (TTMA-CSU) estimation method.
Figure 3: Our hypothesis on TTMA class-specific uncertainty (TTMA-CSU) and average feature distance (AFD) according to class dissimilarity of two classes in class confusion and class similarity scenarios.
Figure 4: Examples of ISIC-18 skin lesion images for different disease classes: (a) AKIEC, (b) BCC, (c) BKL, (d) DF, (e) MEL, (f) NV, (g) VASC.
Figure 5: Histograms of aleatoric uncertainty for correct and incorrect test data for ISIC-18 classification results with (a) TTA, (b) MCDO, and (c) TTMA-DU methods. In (a) TTA, the number of correct samples having uncertainty of [0,0.1) is 126. In (b) MCDO, the number of correct samples having uncertainty of [0,0.1) is 151.
...and 9 more figures
