In-domain representation learning for remote sensing

Maxim Neumann, Andre Susano Pinto, Xiaohua Zhai, Neil Houlsby

TL;DR

Remote sensing (RS) has lagged in representation learning; the authors provide five standardized datasets and a common evaluation protocol to study in-domain RS representations. They train RS-specific representations via supervised fine-tuning on in-domain data and show that this approach yields state-of-the-art transfer performance across unseen RS tasks, especially with limited downstream data. Their results reveal that multi-resolution data and dataset quality factors (label accuracy, class diversity) influence the quality of learned representations, and that large weakly labeled RS datasets do not always outperform smaller, curated datasets. They also release the trained representations and code via TensorFlow Hub/TFDS to enable rapid reuse.

Abstract

Given the importance of remote sensing, surprisingly little attention has been paid to it by the representation learning community. To address this and to establish baselines and a common evaluation protocol in this domain, we provide simplified access to 5 diverse remote sensing datasets in a standardized form. Specifically, we investigate in-domain representation learning to develop generic remote sensing representations and explore which characteristics are important for a dataset to be a good source for remote sensing representation learning. The established baselines achieve state-of-the-art performance on these datasets.

Paper Structure

This paper contains 25 sections, 1 equation, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Aggregated mean relative improvement in logit accuracy of fine-tuning from ImageNet and in-domain representations in comparison to training from scratch.
  • Figure 2: Top-1 accuracy rate or mean average precision (mAP) on validation set after training with a given method over a limited number of training examples on each dataset.
  • Figure 3: BigEarthNet image examples. Some images (for example samples 4, 6, 21) may be affected by seasonal snow, clouds, or cloud shadows, which is not reflected in the land cover labels of this dataset.
  • Figure 4: BigEarthNet labels distribution counts.
  • Figure 5: EuroSAT image examples.
  • ...and 3 more figures