Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery
Ionut M. Motoi, Leonardo Saraceni, Daniele Nardi, Thomas A. Ciarfuglia
TL;DR
The paper addresses data scarcity and class imbalance in semantic segmentation of satellite imagery. It adapts Cut-and-Paste data augmentation to semantic segmentation by extracting per-class instances as connected components of the label maps. The method is model-free and draws instances from multiple images to paste onto training samples, increasing data variability. Evaluated on DynamicEarthNet with a baseline U-Net, the approach raises the test-set mIoU from 37.9 to 44.1, indicating improved generalization without extra manual annotation. The technique offers a simple, practical way to improve remote-sensing segmentation and could extend to related tasks such as Change Detection.
Abstract
Satellite imagery is crucial for tasks like environmental monitoring and urban planning. Analysis of such imagery typically relies on semantic segmentation or Land Use Land Cover (LULC) classification to categorize each pixel. Despite the advancements brought about by Deep Neural Networks (DNNs), their performance in segmentation tasks is hindered by limited availability of labeled data, class imbalance, and the inherent variability and complexity of satellite images. To mitigate these issues, our study explores the effectiveness of a Cut-and-Paste augmentation technique for semantic segmentation in satellite images. We adapt this augmentation, which usually requires labeled instances, to the case of semantic segmentation. By leveraging the connected components in the semantic segmentation labels, we extract instances that are then randomly pasted during training. Using the DynamicEarthNet dataset and a U-Net model for evaluation, we found that this augmentation significantly enhances the mIoU score on the test set, from 37.9 to 44.1. This finding highlights the potential of the Cut-and-Paste augmentation to improve the generalization capabilities of semantic segmentation models in satellite imagery.
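The extraction-and-paste idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `connected_components` and `paste_instance` are hypothetical, and the choice of 4-connectivity, uniform random placement, and hard overwrite without blending are assumptions made here, since the abstract does not specify them. The sketch uses plain nested lists so that it runs with the standard library only; a real pipeline would more likely operate on arrays with an image-processing library.

```python
import random
from collections import deque


def connected_components(labels, cls):
    """Return the 4-connected components of class `cls` as lists of (row, col) pixels."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if labels[r][c] != cls or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited pixel of the class.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def paste_instance(dst_image, dst_labels, src_image, pixels, cls, rng):
    """Paste one instance at a random offset, editing image and labels in place.

    Assumption: the pasted pixels simply overwrite the destination (no blending).
    Returns False if the instance does not fit inside the destination.
    """
    h, w = len(dst_labels), len(dst_labels[0])
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    top, left = min(ys), min(xs)
    inst_h, inst_w = max(ys) - top + 1, max(xs) - left + 1
    if inst_h > h or inst_w > w:
        return False
    # Random top-left corner such that the bounding box stays in bounds.
    off_y = rng.randint(0, h - inst_h)
    off_x = rng.randint(0, w - inst_w)
    for y, x in pixels:
        ty, tx = y - top + off_y, x - left + off_x
        dst_image[ty][tx] = src_image[y][x]
        dst_labels[ty][tx] = cls
    return True


# Toy example: a 4x4 source label map with two separate regions of class 1.
src_labels = [[0, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]]
src_image = [[10 * r + c for c in range(4)] for r in range(4)]
dst_labels = [[0] * 4 for _ in range(4)]
dst_image = [[0] * 4 for _ in range(4)]

instances = connected_components(src_labels, cls=1)
rng = random.Random(0)
pasted = paste_instance(dst_image, dst_labels, src_image, instances[0], 1, rng)
```

In this toy case the source yields two instances of class 1 (one of three pixels, one of a single pixel), and the first is copied into an empty destination together with its labels. Because pixels and labels are pasted together, the augmented sample stays consistent with its ground truth and needs no new annotation, which is the property the abstract highlights.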
