Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning
Esther Rolf, Konstantin Klemmer, Caleb Robinson, Hannah Kerner
TL;DR
This position paper argues that satellite data forms a distinct ML modality that is not adequately served by lift-and-shift approaches borrowed from natural images or text. It outlines the characteristics that set machine learning for satellite data (SatML) apart (logarithmic spatial and temporal scales, diverse spectral channels, massive data volumes, and sparse annotations) and highlights deployment, evaluation, and ethical challenges that demand specialized methods. The authors advocate for SatML-specific learning strategies, architectures, and explicit domain-context modeling, and discuss how SatML can enrich broader ML research on distribution shift, self-supervised learning (SSL), multi-modal learning, and positional encodings. They further call for community coordination, benchmarks tied to real-world impact, and governance that ensures both global and local benefits. Together, these points aim to elevate SatML from an application area to a standalone, responsible, and impactful research discipline.
Abstract
Satellite data has the potential to inspire a seismic shift for machine learning -- one in which we rethink existing practices designed for traditional data modalities. As machine learning for satellite data (SatML) gains traction for its real-world impact, our field is at a crossroads. We can either continue applying ill-suited approaches, or we can initiate a new research agenda that centers on the unique characteristics and challenges of satellite data. This position paper argues that satellite data constitutes a distinct modality for machine learning research and that we must recognize it as such to advance the quality and impact of SatML research across theory, methods, and deployment. We outline critical discussion questions and actionable suggestions to transform SatML from merely an intriguing application area into a dedicated research discipline that helps move the needle on big challenges for machine learning and society.
