Unlocking the Use of Raw Multispectral Earth Observation Imagery for Onboard Artificial Intelligence

Gabriele Meoni, Roberto Del Prete, Federico Serva, Alix De Beussche, Olivier Colin, Nicolas Longépé

TL;DR

This work presents a novel methodology to automate the creation of datasets for detecting target events or objects in Sentinel-2 raw data and other multispectral EO pushbroom raw imagery. The methodology is applied to create THRawS (Thermal Hotspots in Raw Sentinel-2 data), the first dataset of Sentinel-2 raw data containing warm thermal hotspots.

Abstract

Nowadays, there is growing interest in applying Artificial Intelligence (AI) on board Earth Observation (EO) satellites for time-critical applications, such as natural disaster response. However, the unavailability of raw satellite data currently hinders research on lightweight pre-processing techniques and limits the exploration of end-to-end pipelines, which could offer more efficient and accurate extraction of insights directly from the source data. To fill this gap, this work presents a novel methodology to automate the creation of datasets for the detection of target events (e.g., warm thermal hotspots) or objects (e.g., vessels) from Sentinel-2 raw data and other multispectral EO pushbroom raw imagery. The presented approach first processes the raw data by applying a pipeline consisting of spatial band registration and georeferencing of the raw data pixels. Then, it detects the target events by applying event-specific state-of-the-art algorithms to the Level-1C products, which are mosaicked and cropped to the georeferenced area of the corresponding raw granule. The detected events are finally re-projected back onto the corresponding raw images. We apply the proposed methodology to create THRawS (Thermal Hotspots in Raw Sentinel-2 data), the first dataset of Sentinel-2 raw data containing warm thermal hotspots. THRawS includes 1090 samples containing wildfires and volcanic eruptions, plus 33,335 event-free acquisitions, to enable thermal hotspot detection and general classification applications. This dataset and the associated toolkits provide the community with both an immediately useful resource and a framework and methodology that can serve as a template for future additions. With this work, we hope to pave the way for research on energy-efficient pre-processing algorithms and AI-based end-to-end processing systems on board EO satellites.
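The final step of the approach, re-projecting detections from the Level-1C crop back onto the raw image, can be illustrated with a minimal sketch. It assumes, purely for illustration, that the mapping from L1C crop pixels to raw granule pixels is available as a 2x3 affine matrix; the function name `warp_bbox` and the example transform are hypothetical and are not taken from the THRawS toolkits.

```python
import numpy as np


def warp_bbox(bbox, affine):
    """Map an axis-aligned box (x_min, y_min, x_max, y_max) through a 2x3
    affine transform and return the axis-aligned envelope of its corners."""
    x0, y0, x1, y1 = bbox
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=float)
    # Homogeneous coordinates: append a column of ones to each corner.
    homog = np.hstack([corners, np.ones((4, 1))])
    # (4, 3) @ (3, 2) -> warped corner coordinates, shape (4, 2).
    warped = homog @ np.asarray(affine, dtype=float).T
    x_min, y_min = warped.min(axis=0)
    x_max, y_max = warped.max(axis=0)
    return (float(x_min), float(y_min), float(x_max), float(y_max))


# Hypothetical transform (assumption): scale by 0.5, then shift by (10, 20) pixels.
affine = [[0.5, 0.0, 10.0],
          [0.0, 0.5, 20.0]]

# A box detected on the L1C crop, expressed in raw granule pixel coordinates.
raw_box = warp_bbox((100, 200, 140, 260), affine)
print(raw_box)  # (60.0, 120.0, 80.0, 150.0)
```

Taking the envelope of the four warped corners keeps the result an axis-aligned box even when the transform includes rotation or shear, at the cost of a slightly looser fit.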

Paper Structure (22 sections, 9 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Illustration of the processing chain from satellite data to L1C data. The images in the raw and L1C format show an eruption of the Etna volcano, Italy, for the two processing levels as RGB-like images.
  • Figure 2: Overview of the dataset creation methodology, consisting of three main steps: 1) procurement of the list of events by visual inspection of other existing databases, 2) data download, 3) filtering of useful granules by using state-of-the-art algorithms designed for L1C data, which are mosaicked and cropped on the areas of the raw granules of interest.
  • Figure 3: Example showcasing the downloaded granules for a specific event of the THRawS dataset. The polygon used for data retrieval (light blue) intersects several granules marked in red, green, and white. The volcanic eruption falls only within the granules whose boundaries are marked in green and red. On the right, a zoomed view shows the band $B_{8A}$ (the first of the collection) of the green and red granules in yellow and pink. Since only the band $B_{8A}$ of the green granule includes the volcanic eruption, the green granule is the only "useful granule". In contrast, although the red polygon partially surrounds the event of interest, this happens for bands other than $B_{8A}$. Because of that, the red granule is "non-useful" under our definition of the band collection $B_S 0 = [B_{8A},B_{11}, B_{12}]$.
  • Figure 4: Pictorial view of a granule with Prior and Afterwards coordinates
  • Figure 5: Each row shows all the processing steps for the bands $B_{8A}, B_{11}, B_{12}$ of the various raw data granules: the first image from the left in each row displays the raw data granule, the second image shows the spatially registered granule, the third image shows the corresponding tiles, cropped to the coordinates of the first band of the raw data granule's band collection, with the detected bounding boxes, and the rightmost image shows the bounding boxes warped onto the raw data granule.
  • ...and 2 more figures
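
The "useful granule" rule described in the caption of Figure 3 can be sketched as a footprint test: a granule is kept only if the event falls inside the footprint of the first band of the band collection ($B_{8A}$). The sketch below is an illustration under stated assumptions, not the authors' implementation: the event is reduced to a single point, footprints are simple polygons in an arbitrary planar coordinate system, and the names `point_in_polygon`, `useful_granules`, `square`, and the example footprints are all hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon` (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        xa, ya = polygon[i]
        xb, yb = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (ya > y) != (yb > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = xa + (y - ya) * (xb - xa) / (yb - ya)
            if x < x_cross:
                inside = not inside
    return inside


def useful_granules(event, footprints, band_collection=("B8A", "B11", "B12")):
    """Return granules whose first-band footprint contains the event point."""
    first_band = band_collection[0]
    return [name for name, bands in footprints.items()
            if point_in_polygon(event, bands[first_band])]


def square(x0, y0, x1, y1):
    """Helper building a rectangular footprint (illustrative only)."""
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


# Hypothetical footprints: band footprints of one granule are slightly shifted.
footprints = {
    "green": {"B8A": square(0, 0, 10, 10), "B11": square(1, 0, 11, 10)},
    "red": {"B8A": square(10, 0, 20, 10), "B11": square(8, 0, 18, 10)},
}
event = (9, 5)

print(useful_granules(event, footprints))  # ['green']
```

In this toy setup the red granule contains the event in its B11 footprint but not in its B8A footprint, so it is discarded, mirroring the situation described for the red polygon in the caption.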