Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images
Lianlei Shan, Weiqiang Wang, Ke Lv, Bin Luo
TL;DR
The paper tackles the high annotation burden in aerial image semantic segmentation by introducing edge-guided labeling units and a fully class-balanced active learning framework. It combines an edge-focused labeling strategy, CLIP-informed initial data balance, performance-based subsequent acquisition, class-aware pseudo-labeling, and balanced supervised contrastive learning to address edge errors and severe class imbalance. Empirical results on Deepglobe, Potsdam, and Vaihingen show substantial improvements over state-of-the-art AL methods and across multiple segmentation backbones, with ablations validating each component’s contribution. The work also establishes a fair, strong benchmark for future AL research in aerial imagery and highlights practical gains in labeling efficiency and segmentation accuracy.
Abstract
Semantic segmentation requires pixel-level annotation, which is time-consuming. Active Learning (AL) is a promising way to reduce annotation costs. Because of the gap between aerial and natural images, previous AL methods perform poorly on aerial imagery, mainly because of unsuitable labeling units and the neglect of class imbalance. Previous labeling units are based on whole images or regions, which ignores two characteristics of segmentation tasks and aerial images: segmentation networks often make mistakes in edge regions, and edges in aerial images are often interlaced and irregular. Therefore, an edge-guided labeling unit is proposed as a new, supplementary labeling unit. Class imbalance is also severe and appears in two forms: aerial image data are themselves heavily imbalanced, and existing AL strategies do not fully account for class balance. Both seriously degrade AL performance on aerial images. We enforce class balance at every step where imbalance can arise: the initial labeled data, the subsequently labeled data, and the pseudo-labels. With these two improvements, our method achieves gains of more than 11.2% over state-of-the-art methods on three benchmark datasets, Deepglobe, Potsdam, and Vaihingen, and gains of more than 18.6% over the baseline. Extensive ablation studies show that every module is indispensable. Furthermore, we establish a fair and strong benchmark for future research on AL for aerial image segmentation.
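The two ideas in the abstract, labeling along predicted edges and keeping the labeled set class-balanced, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how edge units are extracted or how the budget is balanced, so the function names `edge_band` and `balanced_budget`, the `width` parameter, and the inverse-frequency allocation rule are all assumptions made for illustration.

```python
import numpy as np


def edge_band(pred: np.ndarray, width: int = 1) -> np.ndarray:
    """Boolean mask of pixels near a boundary in a predicted label map.

    Hypothetical helper: with width=1 it marks the pixels on both sides of
    every class change; each extra unit of width grows the band by one pixel.
    """
    edge = np.zeros(pred.shape, dtype=bool)
    # Class changes between horizontally and vertically adjacent pixels.
    diff_h = pred[:, 1:] != pred[:, :-1]
    diff_v = pred[1:, :] != pred[:-1, :]
    # Mark both pixels that take part in each class change.
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    # Widen the band with a simple 4-neighbour dilation.
    for _ in range(width - 1):
        grown = edge.copy()
        grown[1:, :] |= edge[:-1, :]
        grown[:-1, :] |= edge[1:, :]
        grown[:, 1:] |= edge[:, :-1]
        grown[:, :-1] |= edge[:, 1:]
        edge = grown
    return edge


def balanced_budget(class_counts: np.ndarray, budget: int) -> np.ndarray:
    """Split a labeling budget across classes, favouring rare classes.

    Assumed rule (not from the paper): weights are inversely proportional
    to the number of pixels already labeled for each class.
    """
    weights = 1.0 / (class_counts.astype(float) + 1.0)  # +1 avoids division by zero
    weights /= weights.sum()
    alloc = np.floor(weights * budget).astype(int)
    # Give any units lost to rounding to the rarest classes first.
    leftover = int(budget - alloc.sum())
    order = np.argsort(class_counts, kind="stable")
    alloc[order[:leftover]] += 1
    return alloc
```

In an AL round under these assumptions, `edge_band` would be applied to the current model's prediction to propose pixels for annotation, and `balanced_budget` would decide how many labeling units to request for each class given the counts labeled so far.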
