UrbanSARFloods: Sentinel-1 SLC-Based Benchmark Dataset for Urban and Open-Area Flood Mapping

Jie Zhao, Zhitong Xiong, Xiao Xiang Zhu

TL;DR

UrbanSARFloods delivers a georeferenced benchmark dataset for large-scale flood mapping that jointly covers open and urban areas using Sentinel-1 SLC data, including both intensity and InSAR coherence across pre/post-event states. With 8,879 chips of $512\times512$ pixels ($20\,\mathrm{m}$) and $807{,}500\ \mathrm{km}^2$ of coverage over 18 events across 5 continents, the dataset enables benchmarking of semantic segmentation models on urban and open-area flood scenarios; labeling combines semi-automatic techniques and targeted high-resolution hand annotations. The authors evaluate nine state-of-the-art models and study transfer learning, finding substantial challenges due to data imbalance and domain mismatch between SAR-based inputs and ImageNet pretraining, particularly for the urban flood class. They propose that advances in imbalance handling and domain-adaptive learning, along with expanding the event set, are essential to improve large-scale SAR-based urban flood mapping, and they provide an open resource to catalyze progress in this area.

Abstract

Due to its cloud-penetrating capability and independence from solar illumination, satellite Synthetic Aperture Radar (SAR) is the preferred data source for large-scale flood mapping, providing global coverage across various land cover classes. However, most deep learning studies on large-scale SAR-derived flood mapping have focused on flooded open areas, relying on available open-access datasets (e.g., Sen1Floods11), with limited attention to urban floods. To address this gap, we introduce UrbanSARFloods, a floodwater dataset featuring pre-processed Sentinel-1 intensity data and interferometric coherence imagery acquired before and during flood events. It contains 8,879 chips of $512\times 512$ pixels covering $807{,}500\ \mathrm{km}^2$ across 20 land cover classes and 5 continents, spanning 18 flood events. We used UrbanSARFloods to benchmark existing state-of-the-art convolutional neural networks (CNNs) for segmenting open and urban flood areas. Our findings indicate that prevalent approaches, including the Weighted Cross-Entropy (WCE) loss and transfer learning with pretrained models, fall short of overcoming the obstacles posed by imbalanced data and a small training dataset. Urban flood detection remains challenging. Future research should explore strategies for addressing imbalanced data and investigate transfer learning's potential for SAR-based large-scale flood mapping. In addition, expanding this dataset to include more flood events holds promise for enhancing its utility and advancing flood mapping techniques.
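
For readers unfamiliar with the Weighted Cross-Entropy (WCE) loss mentioned in the abstract, the following is a minimal NumPy sketch of one common formulation. It is not the authors' implementation: the three-class label encoding (0 = non-flood, 1 = open flood, 2 = urban flood), the inverse-frequency weighting scheme, and the function names are all assumptions made for illustration.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Per-class weights proportional to 1 / pixel frequency (assumed scheme).

    Weights of the classes that actually occur are rescaled so that their
    mean is 1; classes absent from `labels` receive weight 0.
    """
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    freq = counts / counts.sum()
    present = counts > 0
    weights = np.zeros(num_classes, dtype=np.float64)
    weights[present] = 1.0 / freq[present]
    return weights / weights[present].sum() * present.sum()


def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted cross-entropy for a single chip.

    logits: float array of shape (H, W, C) with raw class scores.
    labels: int array of shape (H, W) with class indices in [0, C).
    Returns the weight-normalised mean of the per-pixel losses.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Log-probability assigned to the true class of each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    pixel_weights = class_weights[labels]
    return float(-(pixel_weights * picked).sum() / pixel_weights.sum())
```

Rare classes such as urban flood receive larger weights, so their pixels contribute more to the loss. As the abstract notes, this kind of reweighting alone did not resolve the imbalance problem in the benchmark.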

Paper Structure (16 sections, 4 figures, 6 tables)

This paper contains 16 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of the UrbanSARFloods dataset.
  • Figure 2: Statistics of label distribution of UrbanSARFloods.
  • Figure 3: Statistics of semi-automatic label distribution of UrbanSARFloods.
  • Figure 4: Example of one test site (Weihui): flood label data, SAR data, and generated flood maps using different models trained from scratch.