SenPa-MAE: Sensor Parameter Aware Masked Autoencoder for Multi-Satellite Self-Supervised Pretraining
Jonathan Prexl, Michael Schmitt
TL;DR
SenPa-MAE addresses the challenge of building a sensor-agnostic Earth observation (EO) foundation model by integrating sensor parameters directly into the embedding process. It extends masked autoencoding to multispectral imagery with a dedicated sensor-parameter encoding module for spectral response functions $\{\boldsymbol{\lambda}_c\}$ and ground sampling distances $\{\sigma_c\}$, and introduces Spectral Superposition Augmentation to diversify the training data. The approach enables cross-sensor pretraining across Landsat, Sentinel-2, and Planet-SuperDove imagery, yielding more robust zero-shot and fine-tuned performance on multi-sensor land-cover tasks. This sensor-aware pretraining framework is a step toward sensor-independent inference and cross-sensor fusion, with potential applicability to broader EO tasks and hyperspectral data.
Abstract
This paper introduces SenPa-MAE, a transformer architecture that encodes the sensor parameters of an observed multispectral signal into the image embeddings. SenPa-MAE can be pre-trained on imagery from different satellites with non-matching spectral or geometric sensor characteristics. To incorporate sensor parameters, we propose a versatile sensor parameter encoding module as well as a data augmentation strategy that diversifies the pre-training dataset. This enables the model to differentiate between sensors and to learn how sensor parameters relate to the observed signal. Given the rising number of Earth observation satellite missions and the diversity of their sensor specifications, our approach paves the way towards a sensor-independent Earth observation foundation model. This opens up possibilities such as cross-sensor training and sensor-independent inference.
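The augmentation idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on one labeled assumption: a band's measured value is linear in its spectral response function, so a weighted mix of bands behaves like one synthetic band whose response is the same weighted mix of the original responses. The function name `superpose_bands`, its arguments, and the normalisation of the weights are all hypothetical choices made for this illustration.

```python
def superpose_bands(pixel_values, responses, weights):
    """Mix C bands into one synthetic band (illustrative sketch only).

    pixel_values: per-band values for one pixel, length C
    responses:    C response curves sampled on a shared wavelength grid
    weights:      C non-negative mixing weights (normalised here to sum to 1)

    Assumption: the band signal is linear in the response function, so the
    same weights apply to both the pixel values and the response curves.
    """
    if not (len(pixel_values) == len(responses) == len(weights)):
        raise ValueError("pixel_values, responses and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [x / total for x in weights]

    # Synthetic pixel value: weighted sum of the original band values.
    mixed_value = sum(wi * p for wi, p in zip(w, pixel_values))

    # Synthetic response: the same weighted sum, per wavelength sample.
    n_samples = len(responses[0])
    mixed_response = [
        sum(wi * r[k] for wi, r in zip(w, responses)) for k in range(n_samples)
    ]
    return mixed_value, mixed_response


# Two toy bands with non-overlapping responses on a 3-sample wavelength grid.
value, response = superpose_bands(
    pixel_values=[0.2, 0.4],
    responses=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    weights=[1.0, 3.0],
)
```

Under this assumption the synthetic band acts as a new "virtual sensor" channel with a known response function, which is the kind of variation a sensor-parameter-aware model would need to see during pre-training.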
