TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis
Zhengpeng Feng, Clement Atzberger, Sadiq Jaffer, Jovana Knezevic, Silja Sormunen, Robin Young, Madeline C. Lisaius, Markus Immitzer, Toby Jackson, James Ball, David A. Coomes, Anil Madhavapeddy, Andrew Blake, Srinivasan Keshav
TL;DR
TESSERA addresses irregular Earth-observation (EO) time series by learning pixel-wise, multi-modal embeddings that are invariant to temporal sampling, using a dual encoder (SAR and optical) and a large Barlow Twins (BT) projector. It introduces the d-pixel temporal representation, global shuffling, and mix-up regularization, producing 128-D embeddings that are quantized to 8-bit and released globally as annual maps at 10 m resolution, together with the open GeoTessera library. Across six downstream benchmarks for classification, segmentation, and regression, TESSERA achieves state-of-the-art accuracy with high label efficiency, often needing only lightweight heads and minimal computation. This Embeddings-as-Data approach democratizes access to high-performance EO features, enabling large-scale retrieval and inference with practical tools while maintaining strong performance under cloudiness and data sparsity.
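To make the 8-bit quantization step concrete, the following is a minimal sketch of one common scheme: symmetric int8 quantization with a per-pixel scale. The summary does not state which scheme TESSERA actually uses, so the function names (`quantize_int8`, `dequantize_int8`), the per-pixel scale, and the random stand-in data are all assumptions made for illustration.

```python
import numpy as np

def quantize_int8(emb):
    """Symmetric per-vector int8 quantization (illustrative assumption,
    not necessarily the scheme used for the released embeddings)."""
    # One scale per embedding vector, chosen so the largest magnitude maps to 127.
    max_abs = np.max(np.abs(emb), axis=-1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / 127.0, 1.0).astype(np.float32)
    q = np.clip(np.rint(emb / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float embeddings from int8 codes and scales."""
    return q.astype(np.float32) * scale

# Hypothetical stand-in for four 128-D pixel embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 128)).astype(np.float32)

q, scale = quantize_int8(emb)
recon = dequantize_int8(q, scale)
```

Under this scheme each 128-D vector is stored in 128 bytes plus one float scale, and the reconstruction error per component is bounded by half a quantization step.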
Abstract
Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but discards information about vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regularization to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10 m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for retrieval and inference at planetary scale. The model training/inference code, downstream task code, and pre-generated embeddings can be accessed at https://github.com/ucam-eo
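The training idea described above (two sparse random temporal samples of the same pixel, pulled together by a Barlow Twins objective) can be sketched as follows. This is a minimal NumPy illustration of the generic Barlow Twins loss, not the authors' implementation: the helper names (`sample_valid_timesteps`, `barlow_twins_loss`), the off-diagonal weight `lambda_offdiag`, and the random stand-in data are assumptions, and the encoder and projector are omitted.

```python
import numpy as np

def sample_valid_timesteps(valid_mask, n_samples, rng):
    """Pick n_samples time indices among valid (e.g. cloud-free) observations.
    Falls back to sampling with replacement when too few are valid."""
    valid_idx = np.flatnonzero(valid_mask)
    replace = len(valid_idx) < n_samples
    chosen = rng.choice(valid_idx, size=n_samples, replace=replace)
    return np.sort(chosen)

def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-8):
    """Generic Barlow Twins loss on two (batch, dim) projections."""
    n = z_a.shape[0]
    # Standardize each feature dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    on_diag = np.sum((diag - 1.0) ** 2)          # invariance term
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # redundancy-reduction term
    return on_diag + lambda_offdiag * off_diag

rng = np.random.default_rng(0)

# Hypothetical pixel time series: 40 acquisitions, 10 bands, about half valid.
valid_mask = rng.random(40) < 0.5
valid_mask[0] = True  # ensure at least one valid observation
idx_a = sample_valid_timesteps(valid_mask, n_samples=8, rng=rng)
idx_b = sample_valid_timesteps(valid_mask, n_samples=8, rng=rng)

# Stand-ins for projector outputs of the two views for a batch of pixels.
z = rng.normal(size=(256, 16))
loss_same = barlow_twins_loss(z, z, lambda_offdiag=0.0)
loss_unrelated = barlow_twins_loss(z, rng.normal(size=(256, 16)), lambda_offdiag=0.0)
```

In a real pipeline, `idx_a` and `idx_b` would select two different subsets of the same pixel's observations, both would pass through the encoder and projector, and the loss would drive the resulting embeddings to agree regardless of which valid dates were sampled.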
