Neural Embedding Compression For Efficient Multi-Task Earth Observation Modelling
Carlos Gomes, Thomas Brunschwiler
TL;DR
NEC targets the high transfer and storage costs of large Earth observation (EO) datasets by exchanging compressed embeddings instead of raw data, adapting foundation models (FMs) through a lightweight learned neural compression pipeline. The method optimizes a rate-distortion objective, $\min \lambda R + D$, where the rate is estimated as $R \approx -\mathbb{E}[\log_2 p(y)]$ and $D$ is an MAE-based distortion term, while updating only about 10% of the FM parameters. Compared with applying traditional compression to the raw data, NEC reaches similar accuracy on scene classification and semantic segmentation with roughly 75–90% less data. The results indicate a data-efficient, scalable approach to multi-task EO modelling that is competitive with JPEG 2000, and the authors provide open-source code to enable broader adoption in sustainable EO data workflows.
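To make the objective concrete, the following is a minimal sketch of how $\lambda R + D$ could be evaluated for a toy example. It is not the authors' implementation: the function names, the discrete probability table `pmf`, and the numeric values are all hypothetical, and the distortion is passed in as a precomputed number rather than derived from an actual reconstruction.

```python
import math

def estimated_rate_bits(symbols, pmf):
    """Empirical estimate of R = -E[log2 p(y)], in bits per symbol."""
    return -sum(math.log2(pmf[s]) for s in symbols) / len(symbols)

def rate_distortion_loss(rate, distortion, lam):
    """Weighted objective lam * R + D; a larger lam penalises rate more."""
    return lam * rate + distortion

# Toy example (hypothetical values): four quantized symbols
# scored under a uniform binary probability model.
symbols = [0, 0, 1, 0]
pmf = {0: 0.5, 1: 0.5}

rate = estimated_rate_bits(symbols, pmf)
loss = rate_distortion_loss(rate, distortion=0.25, lam=0.1)
print(f"rate = {rate:.3f} bits/symbol, loss = {loss:.3f}")
```

In this sketch, raising `lam` pushes the optimum towards shorter codes at the expense of distortion, which is the tradeoff between compression rate and embedding utility described in the abstract.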
Abstract
As repositories of large-scale Earth observation (EO) data have grown, so have the transfer and storage costs of model training and inference, consuming significant resources. We introduce Neural Embedding Compression (NEC), based on the transfer of compressed embeddings to data consumers instead of raw data. We adapt foundation models (FMs) through learned neural compression to generate multi-task embeddings while navigating the tradeoff between compression rate and embedding utility. We update only a small fraction of the FM parameters (10%) for a short training period (1% of the iterations of pre-training). We evaluate NEC on two EO tasks: scene classification and semantic segmentation. Compared with applying traditional compression to the raw data, NEC achieves similar accuracy with a 75% to 90% reduction in data. Even at 99.7% compression, performance drops by only 5% on the scene classification task. Overall, NEC is a data-efficient yet performant approach for multi-task EO modelling.
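As a reading aid for the percentages above, the short sketch below converts a fractional size reduction into an N:1 compression ratio. It assumes each percentage denotes the reduction in data size relative to whatever baseline it is measured against; the helper name `compression_ratio` is hypothetical.

```python
def compression_ratio(reduction_fraction):
    """Convert a fractional size reduction r (0 <= r < 1) into an N:1 ratio."""
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return 1.0 / (1.0 - reduction_fraction)

# Percentages quoted in the abstract, expressed as fractions.
for r in (0.75, 0.90, 0.997):
    print(f"{r:.1%} reduction -> about {compression_ratio(r):.0f}:1")
```

Under that assumption, reductions of 75%, 90% and 99.7% correspond to ratios of roughly 4:1, 10:1 and 333:1 respectively.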
