A simple, strong baseline for building damage detection on the xBD dataset

Sebastian Gerard, Paul Borne-Pons, Josephine Sullivan

TL;DR

Problem: building-damage detection on satellite imagery using the xBD dataset. Approach: derive a simple, strong baseline from the xView2 winner by stepwise simplification, and test under non-overlapping event splits. Key findings: the simplified baseline retains most performance (within ~2 percentage points of the reproduction) but both models exhibit strong generalization gaps for unseen disasters, especially for minor and major damage classes. Significance: the work delivers a practical, easier-to-use baseline and highlights dataset distribution as a major factor in generalization, with code and data loaders published for reproducibility.

Abstract

We construct a strong baseline method for building damage detection by starting with the highly engineered winning solution of the xView2 competition and gradually stripping away components. This way, we obtain a much simpler method while retaining adequate performance. We expect the simplified solution to be more widely and easily applicable. This expectation is based on the reduced complexity, as well as the fact that we choose hyperparameters based on simple heuristics that transfer to other datasets. We then rearrange the xView2 dataset splits such that the test locations are not seen during training, contrary to the competition setup. In this setting, we find that both the complex and the simplified model fail to generalize to unseen locations. Analyzing the dataset indicates that this failure to generalize is not only a model-based problem, but that the difficulty might also be influenced by the unequal class distributions between events. Code, including the baseline model, is available at https://github.com/PaulBorneP/Xview2_Strong_Baseline
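
The rearranged split described in the abstract can be illustrated with a minimal sketch. This is not the authors' actual split procedure. It assumes that each tile id has the form `<event>_<number>`, that events are assigned to the test set at random, and that the hyphenated event names in `LINKED_EVENTS` match the dataset's identifiers. The helper names (`event_of`, `group_key`, `event_disjoint_split`) are hypothetical. The linked groups mirror the overlaps described in the captions of Figures 4 and 5, where related disasters are kept in the same set.

```python
import random
from collections import defaultdict

# Assumed event identifiers; groups of events that must share a split,
# following the overlaps described in the captions of Figures 4 and 5.
LINKED_EVENTS = [
    {"socal-fire", "woolsey-fire"},
    {"joplin-tornado", "moore-tornado", "midwest-flooding"},
]


def event_of(tile_id):
    """Return the event name of a tile id such as 'moore-tornado_00000123' (assumed format)."""
    return tile_id.rsplit("_", 1)[0]


def group_key(event):
    """Map an event to the key of its linked group, or to itself if it is not linked."""
    for group in LINKED_EVENTS:
        if event in group:
            return "+".join(sorted(group))
    return event


def event_disjoint_split(tile_ids, test_fraction=0.25, seed=0):
    """Split tiles so that no event (or linked group of events) appears in both sets."""
    groups = defaultdict(list)
    for tile_id in tile_ids:
        groups[group_key(event_of(tile_id))].append(tile_id)

    keys = sorted(groups)                  # deterministic order before shuffling
    random.Random(seed).shuffle(keys)

    # At least one group in the test set, and at least one left for training
    # whenever there are two or more groups.
    n_test = min(max(1, round(len(keys) * test_fraction)), len(keys) - 1)
    test_keys = set(keys[:n_test])

    train = [t for k in keys if k not in test_keys for t in groups[k]]
    test = [t for k in keys if k in test_keys for t in groups[k]]
    return train, test
```

Splitting by group key rather than by tile is what prevents a test location from being seen during training. A random tile-level split would place tiles from the same disaster on both sides.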

Paper Structure

This paper contains 14 sections, 8 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Published weights, our reproduction, our simplified model, generalization. We compare the following models: The leftmost (brown) bar represents the published weights of the competition-winning solution [durnov_xview2_2020]. The $2^{nd}$ (dark green) bar uses the published code of the winning solution to retrain the model on our hardware. We can see a drop in performance that is not based on any intentional changes to the code. The $3^{rd}$ (dark blue) bar shows our strong baseline model, derived from the winning solution by various simplification steps. It performs slightly worse than our reproduction. The two rightmost bars are, in order, the winning model (light green) and our strong baseline (light blue), retrained on a data split where the test disasters are not seen in training. Generalization proves difficult for both models. Although the strong baseline yields worse results, this difference is small compared to the performance drop between the two splits. The drop is especially steep for 'minor damage' and 'major damage'. While the 'no damage' and 'destroyed' classes are easy to distinguish, it is difficult to clearly separate the two damage levels in between, so it is not surprising that the performance drop is strongest in those two classes. The results are based on individual ResNet34-U-Net models.
  • Figure 2: xBD: Example images from the dataset. The images show a location impacted by the 2013 Moore tornado. The dataset contains one image from before and one from after the disaster. The annotation indicates the location of buildings and the respective building damage class.
  • Figure 3: Damage class distributions per event. The distribution between events and event categories can differ greatly. For example, wildfire damages are mostly annotated as destroyed, while floods are mostly annotated as minor or major damage. This imbalance adds to the difficulty of training a model to perform well on all classes of all events.
  • Figure 4: Spatial overlap between "socal fire" and "woolsey fire", two disasters that can be found in our dataset and that actually describe the same fire that occurred around Los Angeles in November 2018. Map by https://www.openstreetmap.org/copyright.
  • Figure 5: Spatial proximity between "joplin tornado", "moore tornado" and "midwestern floodings". These disasters occurred in geographically close areas in the state of Oklahoma and were kept in the same set as a precaution. Map by https://www.openstreetmap.org/copyright.
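
The per-event class imbalance described in the caption of Figure 3 can be quantified with a short sketch. It assumes building annotations are available as `(event, damage_label)` pairs and that the four damage classes use the hyphenated label strings below. The function names and the choice of total variation distance as a comparison measure are illustrative assumptions, not part of the paper, and the toy annotations in the usage example are invented.

```python
from collections import Counter, defaultdict

# Assumed label strings for the four damage classes.
DAMAGE_CLASSES = ("no-damage", "minor-damage", "major-damage", "destroyed")


def class_distribution_per_event(buildings):
    """Return {event: {damage_class: fraction}} from (event, label) pairs.

    Labels outside DAMAGE_CLASSES are ignored, so the fractions of each
    event sum to 1 over the four classes.
    """
    counts = defaultdict(Counter)
    for event, label in buildings:
        if label in DAMAGE_CLASSES:
            counts[event][label] += 1

    distributions = {}
    for event, counter in counts.items():
        total = sum(counter.values())
        distributions[event] = {c: counter[c] / total for c in DAMAGE_CLASSES}
    return distributions


def total_variation(p, q):
    """Total variation distance between two class distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(p[c] - q[c]) for c in DAMAGE_CLASSES)


# Invented toy annotations, only to show the call pattern.
toy = (
    [("toy-wildfire", "destroyed")] * 3
    + [("toy-wildfire", "no-damage")]
    + [("toy-flood", "minor-damage")] * 2
    + [("toy-flood", "major-damage")] * 2
)
dist = class_distribution_per_event(toy)
gap = total_variation(dist["toy-wildfire"], dist["toy-flood"])
```

A large distance between the class distributions of a training event and a held-out event is one simple indicator that part of the generalization gap may come from the data rather than from the model.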