Aerial Image Classification in Scarce and Unconstrained Environments via Conformal Prediction
Farhad Pourkamali-Anaraki
TL;DR
The paper addresses uncertainty quantification for aerial image classification in data-scarce, unconstrained environments by applying conformal prediction to produce prediction sets that contain the true label with probability at least $1-\alpha$. It systematically compares three nonconformity scores, LAC (Least Ambiguous set-valued Classifier), APS (Adaptive Prediction Sets), and RAPS (Regularized Adaptive Prediction Sets), across calibration pipelines with and without temperature scaling, using three pretrained backbones (MobileNetV2, DenseNet-121, ResNet-152) on the ERA dataset with a seven-class event taxonomy. Key findings: LAC yields the smallest, most informative sets, but its empirical coverage can vary, while APS and RAPS provide robust coverage at the cost of larger sets. The effect of temperature scaling is model-dependent, sometimes shrinking and sometimes enlarging prediction sets, particularly for ResNet. The work demonstrates practical uncertainty quantification in data-scarce aerial contexts and points to model reduction and noisy labels as promising directions for real-time, resource-constrained deployments.
Abstract
This paper presents a comprehensive empirical analysis of conformal prediction methods on a challenging aerial image dataset featuring diverse events in unconstrained environments. Conformal prediction is a powerful post-hoc technique that takes the output of any classifier and transforms it into a set of likely labels, providing a statistical guarantee on the coverage of the true label. Unlike evaluations on standard benchmarks, our study addresses the complexities of data-scarce and highly variable real-world settings. We investigate the effectiveness of leveraging pretrained models (MobileNet, DenseNet, and ResNet), fine-tuned with limited labeled data, to generate informative prediction sets. To evaluate the impact of calibration, we consider two parallel pipelines (with and without temperature scaling) and assess performance using two key metrics: empirical coverage and average prediction set size. This setup allows us to systematically examine how calibration choices influence the trade-off between reliability and efficiency. Our findings demonstrate that even with relatively small labeled samples and simple nonconformity scores, conformal prediction can yield valuable uncertainty estimates for complex tasks. Our analysis also reveals that temperature scaling, although often employed for calibration, does not consistently lead to smaller prediction sets, so it should be applied with care. In addition, our results highlight the significant potential of model compression techniques within the conformal prediction pipeline for deployment in resource-constrained environments. Based on these observations, we advocate for future research on the impact of noisy or ambiguous labels on conformal prediction performance and on effective model reduction strategies.
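To make the two evaluation metrics concrete, the following is a minimal sketch of split conformal prediction with the LAC score (one minus the softmax probability of the true label), followed by the computation of empirical coverage and average prediction set size. It is an illustration under stated assumptions, not the paper's implementation: the function names (`softmax`, `lac_threshold`, `prediction_set`, `evaluate`, `synthetic_split`), the synthetic logits, the split sizes, and the choice $\alpha = 0.1$ are all hypothetical, and only the seven-class setting is taken from the text. The `temperature` argument shows where a temperature-scaling pipeline would plug in; fitting the temperature on held-out data is not shown.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Softmax with an optional temperature (1.0 means no temperature scaling)."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lac_threshold(cal_probs, cal_labels, alpha):
    """Conformal quantile of the LAC scores s = 1 - p(true label)."""
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank of the quantile
    if rank > n:
        # Too few calibration points for this alpha: include every label.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, qhat):
    """All labels whose LAC score does not exceed the calibrated threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]


def evaluate(test_probs, test_labels, qhat):
    """Return (empirical coverage, average prediction set size)."""
    sets = [prediction_set(p, qhat) for p in test_probs]
    coverage = sum(y in s for s, y in zip(sets, test_labels)) / len(sets)
    avg_size = sum(len(s) for s in sets) / len(sets)
    return coverage, avg_size


def synthetic_split(n, num_classes, rng):
    """Hypothetical stand-in for classifier outputs: noisy logits with a
    boost on the true class. Not derived from the ERA dataset."""
    probs, labels = [], []
    for _ in range(n):
        y = rng.randrange(num_classes)
        logits = [rng.gauss(0.0, 1.0) for _ in range(num_classes)]
        logits[y] += 2.0
        probs.append(softmax(logits))
        labels.append(y)
    return probs, labels


rng = random.Random(0)
cal_probs, cal_labels = synthetic_split(500, 7, rng)     # calibration split
test_probs, test_labels = synthetic_split(2000, 7, rng)  # test split

alpha = 0.1  # target coverage is 1 - alpha
qhat = lac_threshold(cal_probs, cal_labels, alpha)
coverage, avg_size = evaluate(test_probs, test_labels, qhat)
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

On exchangeable calibration and test data, the empirical coverage should land close to $1-\alpha$, with fluctuations that grow as the calibration split shrinks. A smaller average set size at the same coverage indicates a more informative predictor, which is the reliability versus efficiency trade-off the abstract refers to.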
