EarthNets: Empowering AI in Earth Observation

Zhitong Xiong, Fahong Zhang, Yi Wang, Yilei Shi, Xiao Xiang Zhu

TL;DR

EarthNets addresses the lack of unified benchmarking for deep learning (DL) in Earth observation by surveying more than 500 public remote sensing (RS) datasets and analyzing them by volume, modalities, resolutions, and inter-dataset correlations. It introduces a dataset-attribute-based ranking to build a unified five-dataset benchmark spanning image classification, object detection, and semantic segmentation, and releases the EarthNets platform to enable fair, reproducible evaluation and cross-domain model development. Empirical benchmarking on the platform shows that transformer-based architectures often outperform CNN baselines on large-scale RS data, while CNNs may still do better on local regions, which underscores the value of hybrid approaches. The work delivers a practical, open framework that supports reproducible RS DL research and strengthens ties between the computer vision and remote sensing communities.

Abstract

Earth observation (EO), aiming at monitoring the state of planet Earth using remote sensing data, is critical for improving our daily lives and living environment. With a growing number of satellites in orbit, an increasing number of datasets with diverse sensors and research domains are being published to facilitate the research of the remote sensing community. This paper presents a comprehensive review of more than 500 publicly published datasets, including research domains like agriculture, land use and land cover, disaster monitoring, scene understanding, vision-language models, foundation models, climate change, and weather forecasting. We systematically analyze these EO datasets from four aspects: volume, resolution distributions, research domains, and the correlation between datasets. Based on the dataset attributes, we propose to measure, rank, and select datasets to build a new benchmark for model evaluation. Furthermore, a new platform for EO, termed EarthNets, is released to achieve a fair and consistent evaluation of deep learning methods on remote sensing data. EarthNets supports standard dataset libraries and cutting-edge deep learning models to bridge the gap between the remote sensing and machine learning communities. Based on this platform, extensive deep-learning methods are evaluated on the new benchmark. The insightful results are beneficial to future research. The platform and dataset collections are publicly available at https://earthnets.github.io.

Paper Structure (24 sections, 1 equation, 10 figures, 14 tables)

Figures (10)

  • Figure 1: Chronological overview of the volumes (in logarithmic scale) of over 500 existing datasets. As can be seen, increasingly numerous and larger datasets have been constructed and published over time. (Best viewed by zooming in.)
  • Figure 2: Data modalities (outer perimeter) organized by EO tasks (inner labels). Although a wide range of data sources are used for EO, optical data (RGB) remains the most commonly used modality for the majority of RS tasks.
  • Figure 3: Research domains (outer perimeter) organized by EO tasks (inner labels). It can be seen that there are strong correlations between the research domains and EO tasks.
  • Figure 4: Visualization of the relationship between data resolution and the number of annotated classes. An interesting finding is that most datasets have a resolution that is either finer than 1 m or coarser than 10 m. Datasets with a resolution between 1 m and 10 m are scarce.
  • Figure 5: Visualization of the correlation between different datasets. A lighter color means a higher correlation.
  • ...and 5 more figures
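
The resolution gap described in the Figure 4 caption can be checked on any dataset catalog with a simple binning pass. The following is a minimal sketch, not the authors' analysis code: the function names `resolution_bin` and `bin_counts` are hypothetical, the bin edges (1 m and 10 m) are taken from the caption, and the sample values are placeholders rather than figures from the paper.

```python
from collections import Counter


def resolution_bin(gsd_m: float) -> str:
    """Map a spatial resolution in meters to one of three coarse bins.

    The 1 m and 10 m edges follow the Figure 4 caption; treating the
    boundaries as part of the middle bin is an assumption of this sketch.
    """
    if gsd_m < 1.0:
        return "<1 m"
    if gsd_m <= 10.0:
        return "1-10 m"
    return ">10 m"


def bin_counts(resolutions_m):
    """Count how many datasets fall into each resolution bin."""
    return Counter(resolution_bin(r) for r in resolutions_m)


# Placeholder resolutions in meters (hypothetical, NOT taken from the paper).
example_resolutions = [0.3, 0.5, 2.0, 10.0, 30.0, 250.0]
counts = bin_counts(example_resolutions)

for label in ("<1 m", "1-10 m", ">10 m"):
    print(label, counts[label])
```

Run against a real catalog of dataset resolutions, a comparatively small count in the middle bin would correspond to the scarcity the caption reports.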