Probabilistic Machine Learning for Noisy Labels in Earth Observation
Spyros Kondylatos, Nikolaos Ioannis Bountos, Ioannis Prapas, Angelos Zavras, Gustau Camps-Valls, Ioannis Papoutsis
TL;DR
This study tackles label noise in Earth Observation (EO) with a probabilistic ML framework that models input-dependent label noise and yields aleatoric uncertainty estimates. The approach adds a learned, input-dependent noise term to the latent logits and uses Monte Carlo (MC) sampling with a tempered softmax to produce predictions and per-sample uncertainties; the network learns both the mean and the noise terms. Across four EO applications, namely land use and land cover (LULC), landslides, volcanic activity, and wildfires, the probabilistic models generally improve performance and provide reliable uncertainty estimates, validated via Discard Tests and uncertainty-density analyses. The findings argue for uncertainty-aware EO ML as a route to more trustworthy, interpretable, and decision-supportive remote sensing systems. The current work covers aleatoric uncertainty only; extending it to epistemic uncertainty is left to future work.
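As a rough illustration of the mechanism summarized above, the following sketch adds input-dependent Gaussian noise to the logits, passes each noisy sample through a tempered softmax, and averages over MC samples. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mc_tempered_softmax`, the diagonal (per-class) noise model, and the use of predictive entropy as the per-sample uncertainty are all hypothetical choices made for this example. The arrays `mu` and `sigma` stand in for the mean and noise outputs that a network would predict.

```python
import numpy as np


def mc_tempered_softmax(mu, sigma, n_samples=100, temperature=1.0, seed=0):
    """Monte Carlo estimate of class probabilities under noisy logits.

    mu, sigma: arrays of shape (N, C) holding the predicted logit means and
    per-class noise scales (assumed diagonal Gaussian noise).
    Returns (probs, entropy) with shapes (N, C) and (N,).
    """
    rng = np.random.default_rng(seed)
    # Draw standard normal noise for every MC sample: shape (S, N, C).
    eps = rng.standard_normal((n_samples,) + mu.shape)
    # Noisy logits, scaled by the softmax temperature.
    logits = (mu[None] + sigma[None] * eps) / temperature
    # Numerically stable softmax over the class axis.
    logits -= logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=-1, keepdims=True)
    # Average the per-sample probabilities over the MC dimension.
    probs = p.mean(axis=0)
    # Predictive entropy as a per-sample uncertainty score (an assumption).
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return probs, entropy
```

With identical means, an input assigned a larger `sigma` yields a flatter averaged distribution and therefore a higher uncertainty score, which is the behavior an input-dependent noise model is meant to capture.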
Abstract
Label noise poses a significant challenge in Earth Observation (EO), often degrading the performance and reliability of supervised Machine Learning (ML) models. Yet, given the critical nature of several EO applications, developing robust and trustworthy ML solutions is essential. In this study, we take a step in this direction by leveraging probabilistic ML to model input-dependent label noise and quantify data uncertainty in EO tasks, accounting for the unique noise sources inherent in the domain. We train uncertainty-aware probabilistic models across a broad range of high-impact EO applications, spanning diverse noise sources, input modalities, and ML configurations, and introduce a dedicated pipeline to assess their accuracy and reliability. Our experimental results show that the uncertainty-aware models outperform the standard deterministic approaches on most datasets and evaluation metrics. Moreover, through rigorous uncertainty evaluation, we validate the reliability of the predicted uncertainty estimates, enhancing the interpretability of model predictions. Our findings emphasize the importance of modeling label noise and incorporating uncertainty quantification in EO, paving the way for more accurate, reliable, and trustworthy ML solutions in the field.
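The Discard Tests named in the TL;DR can be illustrated generically: rank the test samples by predicted uncertainty, progressively drop the most uncertain ones, and check whether performance on the remaining samples improves. The sketch below is one common form of this idea and not necessarily the exact protocol of the paper; the function name `discard_test`, the use of accuracy as the metric, and the number of steps are assumptions made for this example.

```python
import numpy as np


def discard_test(correct, uncertainty, n_steps=10):
    """Accuracy on the retained samples as the most uncertain are discarded.

    correct: 0/1 array marking whether each prediction was right.
    uncertainty: per-sample uncertainty scores (higher = less certain).
    Returns (fractions_discarded, accuracy_on_remaining).
    """
    correct = np.asarray(correct, dtype=float)
    # Sort so that the most certain samples come first.
    order = np.argsort(uncertainty)
    sorted_correct = correct[order]
    n = len(sorted_correct)
    fractions, accuracies = [], []
    for step in range(n_steps):
        frac = step / n_steps
        # Number of samples kept after discarding the top `frac` by uncertainty.
        keep = n - int(round(frac * n))
        if keep == 0:
            break
        fractions.append(frac)
        accuracies.append(sorted_correct[:keep].mean())
    return np.array(fractions), np.array(accuracies)
```

If the uncertainty estimates are informative, the returned accuracies should tend to rise as the discarded fraction grows; a flat or falling curve would suggest that the uncertainty does not track prediction errors.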
