ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model

Song Zhang, Qingzhong Wang, Junyi Liu, Haoyi Xiong

TL;DR

ALPS tackles the shortage of pixel-level annotations in remote sensing by auto-labeling large unlabeled RS datasets with the vanilla Segment Anything Model (SAM) and distilling mask-level features into pseudo-labels through online clustering. It builds two pseudo-labeled RS datasets, iSAID-PL and SAMRS-PL, to pre-train segmentation models, and demonstrates gains on RS benchmarks such as iSAID and ISPRS Potsdam, with additional validation in the medical domain. Ablation studies confirm the value of associating masks with classes over using binary masks alone, and show that more pre-training data and more pre-training steps further improve performance. The approach offers a scalable, prompt-free way to leverage SAM for segmentation pre-training across domains, including remote sensing and medical imaging.

Abstract

In the fast-growing field of Remote Sensing (RS) image analysis, the gap between massive unlabeled datasets and the ability to fully utilize these datasets for advanced RS analytics presents a significant challenge. To fill the gap, our work introduces an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Segmentation), leveraging the Segment Anything Model (SAM) to predict precise pseudo-labels for RS images without necessitating prior annotations or additional prompts. The proposed pipeline significantly reduces the labor and resource demands traditionally associated with annotating RS datasets. By constructing two comprehensive pseudo-labeled RS datasets via ALPS for pre-training purposes, our approach enhances the performance of downstream tasks across various benchmarks, including iSAID and ISPRS Potsdam. Experiments demonstrate the effectiveness of our framework, showcasing its ability to generalize well across multiple tasks even under the scarcity of extensively annotated datasets, offering a scalable solution to automatic segmentation and annotation challenges in the field. In addition, the proposed pipeline is flexible and can be applied to medical image segmentation, remarkably boosting the performance. Note that ALPS utilizes pre-trained SAM to semi-automatically annotate RS images without additional manual annotations. Though every component in the pipeline has been well explored, integrating clustering algorithms with SAM and novel pseudo-label alignment significantly enhances RS segmentation, as an off-the-shelf tool for pre-training data preparation. Our source code is available at: https://github.com/StriveZs/ALPS.
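
The core idea summarized above (pool a feature per mask, cluster those features online, and use the cluster index as a pseudo-class) can be illustrated with a small sketch. This is not the authors' implementation: the feature map is synthetic rather than a SAM embedding, the masks are hand-made rectangles rather than SAM outputs, sequential k-means is only one simple form of online clustering and may differ from what ALPS uses, and every name (`OnlineKMeans`, `mask_feature`, `paint_label_map`, `run_demo`) is hypothetical.

```python
import numpy as np


class OnlineKMeans:
    """Sequential k-means: centroids are updated one sample at a time.

    Assumption for this sketch: the first `n_clusters` samples seed the centroids.
    """

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters
        self.centroids = []  # one float vector per cluster
        self.counts = []     # number of samples absorbed by each cluster

    def partial_fit(self, x):
        """Assign x to a cluster, update that centroid, and return the cluster index."""
        x = np.asarray(x, dtype=np.float64)
        if len(self.centroids) < self.n_clusters:
            self.centroids.append(x.copy())
            self.counts.append(1)
            return len(self.centroids) - 1
        dists = [np.linalg.norm(x - c) for c in self.centroids]
        k = int(np.argmin(dists))
        self.counts[k] += 1
        # Running-mean update of the winning centroid.
        self.centroids[k] += (x - self.centroids[k]) / self.counts[k]
        return k


def mask_feature(feature_map, mask):
    """Average the (H, W, C) feature map over the pixels of a boolean (H, W) mask."""
    return feature_map[mask].mean(axis=0)


def paint_label_map(masks, labels, ignore_index=255):
    """Combine binary masks and their pseudo-class labels into one label map."""
    h, w = masks[0].shape
    label_map = np.full((h, w), ignore_index, dtype=np.uint8)
    # Paint large masks first so that smaller overlapping masks stay visible.
    order = sorted(range(len(masks)), key=lambda i: masks[i].sum(), reverse=True)
    for i in order:
        label_map[masks[i]] = labels[i]
    return label_map


def run_demo():
    # Synthetic stand-in for an image embedding: left half and right half differ.
    rng = np.random.default_rng(0)
    feature_map = rng.normal(scale=0.1, size=(8, 8, 4))
    feature_map[:, :4, 0] += 5.0
    feature_map[:, 4:, 1] += 5.0

    # Hand-made stand-ins for class-agnostic binary masks.
    left_top = np.zeros((8, 8), dtype=bool)
    left_top[:4, :4] = True
    right = np.zeros((8, 8), dtype=bool)
    right[:, 4:] = True
    left_bottom = np.zeros((8, 8), dtype=bool)
    left_bottom[5:, :4] = True
    masks = [left_top, right, left_bottom]

    clusterer = OnlineKMeans(n_clusters=2)
    labels = [clusterer.partial_fit(mask_feature(feature_map, m)) for m in masks]
    return labels, paint_label_map(masks, labels)


if __name__ == "__main__":
    labels, label_map = run_demo()
    print(labels)
    print(label_map)
```

In this toy setup the two left-hand masks end up in the same cluster and the right-hand mask in another, while pixels not covered by any mask keep the ignore value 255. In a real pipeline the number of clusters, the feature extractor, and the rule for resolving overlapping masks would all need to be chosen to match the data.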

Paper Structure (24 sections, 4 equations, 5 figures, 6 tables)

This paper contains 24 sections, 4 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Examples of SAM and ALPS segmentation results on remote sensing images. (a) Remote sensing images from SAMRS; (b) segmentation results predicted by SAM without any prompts; (c) semantic segmentation results generated by ALPS without any prompts.
  • Figure 2: Illustration of the ALPS framework. The framework consists mainly of two parts, which produce the binary mask set and the PCL for each mask, respectively.
  • Figure 3: Because the color palette is taken from COCO and contains many colors, some colors may look very similar even though they represent different classes. Moreover, these pseudo-labeled RS datasets use different random color sets.
  • Figure 4: (a) Qualitative results for pre-training datasets constructed with different unsupervised methods. (b) Examples of binary masks generated by vanilla SAM.
  • Figure 5: Examples of pseudo-labeled results on the ATLAS2023 dataset.