Multiple Distribution Shift -- Aerial (MDS-A): A Dataset for Test-Time Error Detection and Model Adaptation
Noel Ngu, Aditya Taparia, Gerardo I. Simari, Mario Leiva, Jack Corcoran, Ransalu Senanayake, Paulo Shakarian, Nathaniel D. Bastian
TL;DR
This work tackles robust aerial object detection under distribution shifts caused by weather by introducing MDS-A, a dataset composed of six weather-specific training sets and multiple challenging test sets generated with the AirSim simulator. It evaluates six DeTR-based baselines trained on single-weather conditions, alongside error-detection-rule (EDR) baselines built via DetRuleLearn, highlighting performance degradation under out-of-distribution conditions and the potential of EDR to improve precision at modest recall cost. The dataset includes rich metadata and distribution-difference metrics such as Fréchet Inception Distance, enabling controlled studies of OOD behavior and facilitating test-time adaptation and domain-generalization research. Overall, MDS-A provides a practical benchmark and baseline tools to advance weather-robust aerial perception and encourages exploration of ensembles and meta-learning for improved OOD resilience.
Abstract
Machine learning models assume that training and test samples are drawn from the same distribution. As such, significant differences between training and test distributions often lead to degradations in performance. We introduce Multiple Distribution Shift -- Aerial (MDS-A), a collection of inter-related datasets of the same aerial domain that are perturbed in different ways to better characterize how distribution shift affects performance. Specifically, MDS-A is a set of simulated aerial datasets collected under different weather conditions. We include six datasets, each collected under a different simulated weather condition, along with six baseline object-detection models, as well as several test datasets that mix weather conditions and that we show differ significantly from the training data. In this paper, we present characterizations of MDS-A and provide performance results for the baseline machine learning models (on both their specific training datasets and the test data), as well as results for the baselines after employing recent knowledge-engineering error-detection techniques (EDR) thought to improve out-of-distribution performance. The dataset is available at https://lab-v2.github.io/mdsa-dataset-website.
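
To make concrete how a difference between a training set and a test set can be quantified, the following is a minimal sketch of the Fréchet distance that underlies the Fréchet Inception Distance mentioned in the TL;DR. It is not the authors' implementation: the function name `frechet_distance` and the synthetic 8-dimensional features are assumptions for illustration, whereas a real FID computation would use features from a pretrained Inception network extracted from the images.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Hypothetical stand-ins for image features (assumption: synthetic Gaussian data).
rng = np.random.default_rng(0)
train_feats = rng.normal(0.0, 1.0, size=(500, 8))    # "training" condition
shifted_feats = rng.normal(0.5, 1.0, size=(500, 8))  # mean-shifted "test" condition

same = frechet_distance(train_feats, train_feats)
shifted = frechet_distance(train_feats, shifted_feats)
print(f"same set: {same:.4f}, shifted set: {shifted:.4f}")
```

A set compared with itself yields a distance near zero, while the mean-shifted set yields a clearly larger value. This is the sense in which such a metric can indicate that a test set differs from the training data.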
