Just rotate it! Uncertainty estimation in closed-source models via multiple queries
Konstantinos Pitas, Julyan Arbel
TL;DR
The paper tackles uncertainty estimation for closed-source image classifiers that do not expose post-softmax distributions by using multiple queries on transformed inputs to estimate p_A(x) = P_{T(x)}( f(T(x)) = A ). It derives a Gaussian latent-noise model linking p_A(x) to p(A|x,f) and demonstrates that natural transformations such as rotations yield better calibration than Gaussian perturbations, achieving notable gains in ECE and AUROC on CIFAR-10/100 and ImageNet. A transfer-learning approach learns an empirical latent-noise distribution F_n from open-source data and applies it to closed-source models via p(A|x,f) = 1/(1+exp(a F_n^{-1}(1-p_A(x)))) to further improve calibration, sometimes matching or surpassing the best natural transformations. The work provides a practical framework for obtaining calibrated uncertainty estimates from opaque models and emphasizes the importance of aligning input perturbations with latent-space noise for reliable uncertainty quantification.
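As a rough illustration of the transfer step, the mapping p(A|x,f) = 1/(1+exp(a F_n^{-1}(1-p_A(x)))) can be sketched as follows. This is a minimal sketch, not the authors' implementation. The names `empirical_quantile` and `calibrated_confidence`, the standard-normal stand-in for the latent-noise samples, and the value a = 1.0 are all assumptions made for the example. In practice F_n and a would be fitted on open-source data.

```python
import math
import random
from typing import Sequence


def empirical_quantile(sorted_samples: Sequence[float], q: float) -> float:
    """Inverse empirical CDF F_n^{-1}(q) for samples sorted in ascending order."""
    n = len(sorted_samples)
    # Smallest order statistic whose empirical CDF value is at least q.
    k = max(1, math.ceil(q * n))
    return sorted_samples[min(k, n) - 1]


def calibrated_confidence(p_a: float, sorted_noise: Sequence[float], a: float) -> float:
    """Map the query-based estimate p_A(x) to 1 / (1 + exp(a * F_n^{-1}(1 - p_A(x))))."""
    z = a * empirical_quantile(sorted_noise, 1.0 - p_a)
    # Numerically stable evaluation of 1 / (1 + exp(z)).
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


# Hypothetical latent-noise samples; a standard normal is only a placeholder here.
rng = random.Random(0)
noise = sorted(rng.gauss(0.0, 1.0) for _ in range(2000))

conf_low = calibrated_confidence(0.6, noise, a=1.0)
conf_mid = calibrated_confidence(0.5, noise, a=1.0)
conf_high = calibrated_confidence(0.9, noise, a=1.0)
print(conf_low, conf_mid, conf_high)
```

With a positive `a` and noise centred near zero, a larger p_A(x) maps to a larger confidence, and p_A(x) = 0.5 maps to roughly 0.5.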
Abstract
We propose a simple and effective method to estimate the uncertainty of closed-source deep neural network image classification models. Given a base image, our method creates multiple transformed versions and uses them to query the top-1 prediction of the closed-source model. We demonstrate significant improvements in the calibration of uncertainty estimates compared to the naive baseline of assigning 100% confidence to all predictions. While we initially explore Gaussian perturbations, our empirical findings indicate that natural transformations, such as rotations and elastic deformations, yield even better-calibrated predictions. Furthermore, through empirical results and a straightforward theoretical analysis, we elucidate the reasons behind the superior performance of natural transformations over Gaussian noise. Leveraging these insights, we propose a transfer learning approach that further improves our calibration results.
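The multiple-query procedure described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code. `query_top1` stands for any black-box call that returns only a top-1 label, `transform` draws one random transformation of the input, and taking A to be the most frequent label is an assumption of this example. The toy "image" is just a rotation angle in degrees, and `toy_query_top1` and `random_rotation` are hypothetical stand-ins for a real closed-source model and a real image rotation.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Tuple, TypeVar

X = TypeVar("X")


def estimate_top1_confidence(
    query_top1: Callable[[X], Hashable],
    x: X,
    transform: Callable[[X, random.Random], X],
    n_queries: int = 100,
    seed: int = 0,
) -> Tuple[Hashable, float]:
    """Return the majority label A and the fraction of transformed queries labelled A."""
    rng = random.Random(seed)
    # Query the black box once per randomly transformed copy of x.
    votes = Counter(query_top1(transform(x, rng)) for _ in range(n_queries))
    label, count = votes.most_common(1)[0]
    return label, count / n_queries


def toy_query_top1(angle_deg: float) -> str:
    # Hypothetical black box: predicts "upright" only for small rotations.
    return "upright" if abs(angle_deg) < 20.0 else "other"


def random_rotation(angle_deg: float, rng: random.Random) -> float:
    # Hypothetical transformation: rotate by a uniform angle in [-30, 30] degrees.
    return angle_deg + rng.uniform(-30.0, 30.0)


label, p_a = estimate_top1_confidence(toy_query_top1, 0.0, random_rotation, n_queries=1000)
print(label, p_a)
```

In this toy setup about two thirds of the rotations keep the prediction unchanged, so the estimate lands well below the naive 100% confidence baseline.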
