Lightweight, Pre-trained Transformers for Remote Sensing Timeseries
Gabriel Tseng, Ruben Cartuyvels, Ivan Zvonkov, Mirali Purohit, David Rolnick, Hannah Kerner
TL;DR
Presto addresses the scarcity of labeled data in remote sensing with a lightweight, self-supervised Transformer designed for pixel-timeseries from multiple sensors. It is pre-trained by masked autoencoding with structured masking over 12-month pixel-timeseries that combine 15 dynamic channels with static metadata, and it learns transferable representations that perform well across diverse tasks with far less compute than larger models. Presto performs strongly in timeseries, image, and image-timeseries settings, and ablations confirm the benefits of structured masking, pretraining, and a scalable model size. Because it supports both transfer learning and efficient feature extraction, it is practical to deploy in global-scale remote sensing pipelines, including for practitioners with limited resources.
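To make the idea of structured masking concrete, the following is a minimal sketch, not the authors' implementation. It assumes one pixel is stored as a 12 x 15 array (timesteps by dynamic channels, as described above) and omits static metadata. The two masking strategies shown (whole channels, and a contiguous block of timesteps), the function names, and the zero-filling of masked entries are illustrative assumptions; the strategies actually used by Presto may differ.

```python
import numpy as np

NUM_TIMESTEPS = 12   # 12-month pixel-timeseries (from the text)
NUM_CHANNELS = 15    # 15 dynamic channels (from the text)


def mask_channels(rng, num_masked):
    """Hypothetical strategy: hide every timestep of a few random channels."""
    mask = np.zeros((NUM_TIMESTEPS, NUM_CHANNELS), dtype=bool)
    channels = rng.choice(NUM_CHANNELS, size=num_masked, replace=False)
    mask[:, channels] = True
    return mask


def mask_contiguous_timesteps(rng, num_masked):
    """Hypothetical strategy: hide all channels for a contiguous block of months."""
    mask = np.zeros((NUM_TIMESTEPS, NUM_CHANNELS), dtype=bool)
    start = rng.integers(0, NUM_TIMESTEPS - num_masked + 1)
    mask[start:start + num_masked, :] = True
    return mask


def masked_reconstruction_loss(target, prediction, mask):
    """Mean squared error computed only over the masked entries."""
    return float(np.mean((prediction[mask] - target[mask]) ** 2))


rng = np.random.default_rng(0)
# Synthetic stand-in for one pixel-timeseries; real inputs would be sensor values.
x = rng.normal(size=(NUM_TIMESTEPS, NUM_CHANNELS))

mask = mask_contiguous_timesteps(rng, num_masked=3)
x_visible = np.where(mask, 0.0, x)  # what an encoder would see (assumed zero-fill)

# A trained model would predict the hidden values; here a trivial all-zero
# "prediction" is scored only to show where the loss is evaluated.
loss = masked_reconstruction_loss(x, np.zeros_like(x), mask)
```

In a masked autoencoder, the loss is evaluated only on the hidden entries, so the model must infer them from what remains visible. Structured masks such as these remove whole channels or whole periods at once, which is a harder and arguably more realistic reconstruction problem than hiding isolated values.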
Abstract
Machine learning methods for satellite data have a range of societally relevant applications, but labels used to train models can be difficult or impossible to acquire. Self-supervision is a natural solution in settings with limited labeled data, but current self-supervised models for satellite data fail to take advantage of the characteristics of that data, including the temporal dimension (which is critical for many applications, such as monitoring crop growth) and availability of data from many complementary sensors (which can significantly improve a model's predictive performance). We present Presto (the Pretrained Remote Sensing Transformer), a model pre-trained on remote sensing pixel-timeseries data. By designing Presto specifically for remote sensing data, we can create a significantly smaller but performant model. Presto excels at a wide variety of globally distributed remote sensing tasks and performs competitively with much larger models while requiring far less compute. Presto can be used for transfer learning or as a feature extractor for simple models, enabling efficient deployment at scale.
