EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision
Diego Velazquez, Pau Rodriguez López, Sergio Alonso, Josep M. Gonfaus, Jordi Gonzalez, Gerardo Richarte, Javier Marin, Yoshua Bengio, Alexandre Lacoste
TL;DR
EarthView addresses the need for scalable, unlabeled data in remote sensing by integrating Satellogic, Sentinel, and NEON imagery into a 15-terapixel dataset spanning 2017–2022. The authors introduce EarthMAE, a time- and source-aware masked autoencoder designed to learn from heterogeneous multi-sensor data with diverse masking strategies and temporal encodings. Key findings show that pre-training with Satellogic data, especially when combined with Sentinel data, yields consistent downstream gains, and that incorporating temporality and specialized masking strategies is crucial for performance. The dataset and model together create an open, scalable platform to study self-supervised learning for Earth monitoring, enabling broader access and paving the way for larger, more capable foundation models in Earth observation.
Abstract
This paper presents EarthView, a comprehensive dataset designed for self-supervision on remote sensing data and intended to enhance deep learning applications on Earth monitoring tasks. The dataset spans 15 terapixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1 m spatial resolution data from Satellogic. It provides a wide spectrum of image data at varying resolutions, captured by different sensors and organized coherently into an accessible HuggingFace dataset in Parquet format. The data span five years, from 2017 to 2022. Accompanying the dataset, we introduce EarthMAE, a tailored Masked Autoencoder (MAE) developed to tackle the distinct challenges of remote sensing data. Trained in a self-supervised fashion, EarthMAE processes different data modalities, such as hyperspectral and multispectral imagery, topographical data, and segmentation maps, and it accounts for temporal structure. This model helps us show that pre-training on Satellogic data improves performance on downstream tasks. While there is still a gap to fill in MAE for heterogeneous data, we regard this combination of an expansive, diverse dataset and a versatile model adapted for self-supervised learning as a stride forward in deep learning for Earth monitoring.
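For readers unfamiliar with masked autoencoders, the following is a minimal sketch of the generic random patch masking step that such models build on. It is not EarthMAE's masking strategy, which the abstract does not specify; the function name `random_patch_mask`, the 224-pixel image size, the 16-pixel patch size, and the 0.75 mask ratio are all illustrative assumptions.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists, both sorted.

    Hypothetical helper for illustration: it shows generic MAE-style
    random masking, not the masking used by EarthMAE.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])   # hidden from the encoder
    visible = sorted(order[num_masked:])  # fed to the encoder
    return visible, masked


# Assumed sizes: a 224x224 image cut into 16x16 patches (14x14 grid).
grid = 224 // 16
visible, masked = random_patch_mask(grid * grid, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))
```

In a masked autoencoder, the encoder sees only the visible patches and the decoder is trained to reconstruct the masked ones; a source- or time-aware variant would choose the masked set per sensor or per timestamp rather than uniformly at random.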
