Test-Time Mixup Augmentation for Data and Class-Specific Uncertainty Estimation in Deep Learning Image Classification
Hansang Lee, Haeil Lee, Helen Hong, Junmo Kim
TL;DR
This work tackles uncertainty estimation in deep learning image classification by introducing test-time mixup augmentation (TTMA). It defines two measures: TTMA-DU, an entropy-based data uncertainty computed from predictions on mixup-perturbed test inputs, which better separates correct from incorrect results; and TTMA-CSU, a class-specific uncertainty that reveals class confusion and class similarity in the latent space. In experiments on ISIC-18 and CIFAR-100, TTMA-DU consistently outperforms conventional test-time augmentation (TTA) and Monte Carlo dropout (MCDO) at distinguishing correct from incorrect predictions, while TTMA-CSU provides insights into class relationships beyond what average feature distance captures. The approach offers a practical, interpretable framework for more trustworthy predictions and can inform model improvement. Its main limitation is computational cost, and selective sampling strategies are a direction for future work.
Abstract
Uncertainty estimation of trained deep learning networks is valuable for optimizing learning efficiency and evaluating the reliability of network predictions. In this paper, we propose a method for estimating uncertainty in deep learning image classification using test-time mixup augmentation (TTMA). To improve on the ability of existing aleatoric uncertainty measures to distinguish correct from incorrect predictions, we introduce TTMA data uncertainty (TTMA-DU), which applies mixup augmentation to test data and measures the entropy of the predicted label histogram. In addition to TTMA-DU, we propose TTMA class-specific uncertainty (TTMA-CSU), which captures aleatoric uncertainty specific to individual classes and provides insight into class confusion and class similarity within the trained network. We validate our proposed methods on the ISIC-18 skin lesion diagnosis dataset and the CIFAR-100 real-world image classification dataset. Our experiments show that (1) TTMA-DU differentiates correct from incorrect predictions more effectively than existing uncertainty measures, owing to the mixup perturbation, and (2) TTMA-CSU provides information on class confusion and class similarity for both datasets.
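
The TTMA-DU computation described above (mix the test input with other images, collect the predicted labels, and take the entropy of their histogram) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed mixing weight `lam`, the number of augmentations `n_aug`, the choice of `mix_pool` as the source of mixing images, and the `predict_fn` interface (a batch of images in, hard integer labels out) are all assumptions made for the example.

```python
import numpy as np


def label_histogram_entropy(labels, num_classes):
    """Shannon entropy (in nats) of the histogram of predicted labels."""
    counts = np.bincount(labels, minlength=num_classes)
    probs = counts / counts.sum()
    nonzero = probs[probs > 0]  # skip empty bins so log(0) is never evaluated
    return float(-(nonzero * np.log(nonzero)).sum())


def ttma_data_uncertainty(predict_fn, x, mix_pool, num_classes,
                          lam=0.7, n_aug=20, rng=None):
    """Sketch of a TTMA-DU-style score for a single test image x.

    Assumptions (hypothetical, not taken from the paper):
      - predict_fn maps a batch of images to an array of integer labels;
      - mix_pool is an array of images with the same shape as x;
      - lam is a fixed mixup weight kept on the test image.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw n_aug mixing partners from the pool, with replacement.
    idx = rng.choice(len(mix_pool), size=n_aug, replace=True)
    # Mixup: convex combination of the test image and each partner.
    mixed = lam * x[None, ...] + (1.0 - lam) * mix_pool[idx]
    labels = np.asarray(predict_fn(mixed), dtype=np.int64)
    return label_histogram_entropy(labels, num_classes)
```

Under these assumptions, a score near zero means the predicted label is stable under mixup perturbation, while a score approaching `log(num_classes)` means the predictions scatter across many classes.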
