EIDSeg: A Pixel-Level Semantic Segmentation Dataset for Post-Earthquake Damage Assessment from Social Media Images

Huili Huang, Chengeng Liu, Danrong Zhang, Shail Patel, Anastasiya Masalava, Sagar Sadak, Parisa Babolhavaeji, WeiHong Low, Max Mahdi Roozbahani, J. David Frost

TL;DR

EIDSeg addresses the lack of pixel-level, ground-view labeled data for post-earthquake damage assessment by introducing a large-scale segmentation dataset drawn from social media across nine earthquakes. It implements a three-phase, cross-disciplinary annotation protocol enabling non-experts to produce high-quality pixel-level masks for five infrastructure classes, achieving inter-annotator agreement above 70%. Benchmark experiments across eight state-of-the-art models identify Encoder-only Mask Transformer (EoMT) as the top performer with mIoU around 80.8% and PA around 90.3%, demonstrating strong potential for fine-grained, rapid damage assessment from public imagery. The dataset, along with its rigorous labeling guidelines, lays a foundation for faster, more granular post-disaster analysis and cross-event generalization in real-world rescue and recovery efforts.

Abstract

Rapid post-earthquake damage assessment is crucial for rescue and resource planning. Still, existing remote sensing methods depend on costly aerial images and expert labeling, and they produce only binary damage maps for early-stage evaluation. Although ground-level images from social networks provide a valuable source to fill this gap, a large pixel-level annotated dataset for this task is still unavailable. We introduce EIDSeg, the first large-scale semantic segmentation dataset specifically for post-earthquake social media imagery. The dataset comprises 3,266 images from nine major earthquakes (2008-2023), annotated across five classes of infrastructure damage: Undamaged Building, Damaged Building, Destroyed Building, Undamaged Road, and Damaged Road. We propose a practical three-phase cross-disciplinary annotation protocol with labeling guidelines that enables consistent segmentation by non-expert annotators, achieving over 70% inter-annotator agreement. We benchmark several state-of-the-art segmentation models, identifying Encoder-only Mask Transformer (EoMT) as the top-performing method with a Mean Intersection over Union (mIoU) of 80.8%. By unlocking social networks' rich ground-level perspective, our work paves the way for faster, finer-grained damage assessment in post-earthquake scenarios.
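
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how mIoU and pixel accuracy (our reading of "PA" in the TL;DR) are commonly computed from a predicted mask and a ground-truth mask. It is not the authors' evaluation code: the function names, the list-of-lists mask representation, and the choice to skip classes absent from both masks are assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: a mask is a 2D grid of integer class ids (0 .. num_classes - 1).
Mask = Sequence[Sequence[int]]


def confusion_matrix(pred: Mask, truth: Mask, num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                       # truth c, predicted differs
        union = tp + fp + fn
        if union:  # assumption: ignore classes absent from both masks
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes.
truth = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
cm = confusion_matrix(pred, truth, num_classes=2)
print(pixel_accuracy(cm), mean_iou(cm))
```

In the toy example three of four pixels match, so pixel accuracy is 0.75, while the per-class IoUs are 1/2 and 2/3. This shows why mIoU is usually the stricter of the two numbers.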

Paper Structure

This paper contains 20 sections, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Annotation workflow of EIDSeg. (a): Dataset creation: images are collected and filtered from the EID and DSS datasets, then independently labeled by annotation groups (A–C). Experts evaluate label quality, and high-quality samples are curated into the final EIDSeg dataset. (b): Label evaluation: annotations undergo expert review following an "Undesignated Checking" step, which addresses uncertain labels provided by annotators. If inter-annotator agreement (mIoU) exceeds a threshold $T$, the image proceeds to expert adjudication; otherwise, it is discarded to ensure dataset integrity.
  • Figure 2: Damage Mask examples of EIDSeg.
  • Figure 3: Weekly segmentation agreement across annotators. Left: Inter-annotator mIoU (%) among Groups A, B, and C. Right: Average annotator–expert mIoU (%) within each group over time. Note that some annotators completed the assigned tasks earlier, which explains the different end dates in the figure (Group A in Week 8, and Group B in Weeks 7–8).
  • Figure 4: Pixel distribution across segmentation classes in the EIDSeg dataset. Each bar shows the total pixel count per class, along with the percentage it contributes to the full dataset. Class colors are chosen to reflect the semantic segmentation labels: Undamaged Building, Damaged Building, Debris, Undamaged Road, and Damaged Road.
  • Figure 5: Examples of annotator misalignment observed during the annotation process. Annotation A and Annotation B show labels produced by two annotators within the same group. The failure reasons are summarized in the rightmost column, with additional details provided in Appendix A. Class colors correspond to the semantic segmentation labels: Undamaged Building, Damaged Building, Debris, Undamaged Road, Damaged Road, and the Undesignated category.
  • ...and 2 more figures
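
The label-evaluation gate described in the Figure 1 caption (agreement above a threshold $T$ leads to expert adjudication, otherwise the image is discarded) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the source does not give the value of $T$ or say how agreement is aggregated across more than two annotators, so the mean over annotator pairs, the function names, and the threshold values below are hypothetical. The "Undesignated Checking" step is not modelled.

```python
from itertools import combinations
from typing import List, Sequence

# Assumption: a mask is a 2D grid of integer class ids (0 .. num_classes - 1).
Mask = Sequence[Sequence[int]]


def mask_miou(a: Mask, b: Mask, num_classes: int) -> float:
    """Symmetric mIoU between two annotators' masks of the same image."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            if x == y:
                inter[x] += 1
                union[x] += 1
            else:
                # The pixel belongs to the union of both disagreeing classes.
                union[x] += 1
                union[y] += 1
    ious = [i / u for i, u in zip(inter, union) if u]
    return sum(ious) / len(ious) if ious else 0.0


def agreement(masks: List[Mask], num_classes: int) -> float:
    """Hypothetical aggregation: mean mIoU over all annotator pairs."""
    if len(masks) < 2:
        raise ValueError("need at least two annotations to measure agreement")
    pairs = list(combinations(masks, 2))
    return sum(mask_miou(a, b, num_classes) for a, b in pairs) / len(pairs)


def triage(masks: List[Mask], num_classes: int, threshold: float) -> str:
    """Route an image as in the caption: above T goes to experts, else discard."""
    if agreement(masks, num_classes) > threshold:
        return "expert_adjudication"
    return "discard"


# Toy example: two annotators disagree on one of four pixels.
ann_a = [[0, 0], [1, 1]]
ann_b = [[0, 1], [1, 1]]
print(triage([ann_a, ann_b], num_classes=2, threshold=0.5))  # threshold is illustrative
```

Averaging over pairs keeps the measure symmetric, so no single annotator is treated as the reference. A stricter variant could take the minimum pairwise score instead.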