Hierarchical Semi-Supervised Active Learning for Remote Sensing

Wei Huang, Zhitong Xiong, Chenying Liu, Xiao Xiang Zhu

TL;DR

The paper tackles label scarcity in remote sensing by combining semi-supervised learning (SSL) with a novel hierarchical active learning (HAL) strategy in an iterative loop, a framework the authors call HSSAL. HAL provides scalable, diverse, and uncertainty-aware sample querying, while SSL, via weak-to-strong self-training, expands the effective training set with unlabeled data. On UCM, AID, and NWPU-RESISC45, HSSAL consistently outperforms SSL-only and AL-only baselines, reaching over 95% of fully-supervised accuracy with only 8%, 4%, and 2% labeled training data, respectively. The approach uses a DINOv2-based encoder, gradient-based uncertainty, and spectral clustering to explore the data manifold efficiently, which suggests applicability to other RS tasks and a possible extension to dense prediction problems.

Abstract

The performance of deep learning models in remote sensing (RS) strongly depends on the availability of high-quality labeled data. However, collecting large-scale annotations is costly and time-consuming, while vast amounts of unlabeled imagery remain underutilized. To address this challenge, we propose a Hierarchical Semi-Supervised Active Learning (HSSAL) framework that integrates semi-supervised learning (SSL) and a novel hierarchical active learning (HAL) strategy in a closed iterative loop. In each iteration, SSL refines the model using both labeled data through supervised learning and unlabeled data via weak-to-strong self-training, improving feature representation and uncertainty estimation. Guided by the refined representations and uncertainty cues of unlabeled samples, HAL then conducts sample querying through a progressive clustering strategy, selecting the most informative instances that jointly satisfy the criteria of scalability, diversity, and uncertainty. This hierarchical process ensures both efficiency and representativeness in sample selection. Extensive experiments on three benchmark RS scene classification datasets, UCM, AID, and NWPU-RESISC45, demonstrate that HSSAL consistently outperforms SSL-only and AL-only baselines. Remarkably, with only 8%, 4%, and 2% labeled training data on UCM, AID, and NWPU-RESISC45, respectively, HSSAL achieves over 95% of fully-supervised accuracy, highlighting its superior label efficiency, which comes from exploiting the information in unlabeled data. Our code will be publicly available.
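To make the weak-to-strong self-training idea concrete, the following is a minimal NumPy sketch in the style of FixMatch (which the figure captions name as the SSL method). It is an illustration under stated assumptions, not the authors' implementation: the function names and the confidence threshold of 0.95 are hypothetical, and the logits are assumed to come from the same model applied to a weakly and a strongly augmented view of each unlabeled image.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def weak_to_strong_loss(weak_logits, strong_logits, threshold=0.95):
    """Pseudo-label loss on unlabeled data (threshold value is an assumption).

    Predictions on the weak view give hard pseudo-labels; only confident
    ones supervise the prediction on the strong view.
    """
    probs_weak = softmax(weak_logits)
    pseudo = probs_weak.argmax(axis=1)        # hard pseudo-labels
    confidence = probs_weak.max(axis=1)
    mask = confidence >= threshold            # keep confident samples only
    log_probs_strong = np.log(softmax(strong_logits) + 1e-12)
    ce = -log_probs_strong[np.arange(len(pseudo)), pseudo]
    # Average over the whole batch so unconfident samples contribute zero.
    loss = float((ce * mask).mean())
    return loss, pseudo, mask
```

Samples that fail the threshold are ignored by the loss here; in an active learning loop they are natural candidates for querying rather than for self-training.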

Paper Structure

This paper contains 32 sections, 24 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overview of HSSAL for one round. Each round alternates SSL and HAL in a cooperative loop. SSL: the model learns on labeled data via supervised loss and on unlabeled data via weak-to-strong self-training. HAL: systematically integrates scalability, diversity, and uncertainty into a unified sample selection pipeline. It proceeds through four structured steps: (1) feature extraction and uncertainty estimation; (2) mini-batch partitioning for scalable processing; (3) spectral clustering with uncertainty-aware sampling; and (4) sample annotation and dataset update. By hierarchically organizing these steps, HAL achieves scalable computation, diverse representation coverage, and uncertainty-driven querying. Through iterative SSL–HAL interaction, HSSAL progressively refines both feature representations and labeled sets, exploiting low-uncertainty samples for learning and high-uncertainty ones for querying.
  • Figure 2: Runtime of HAL under different mini-batch sizes on 8% labeled NWPU-RESISC45. Larger batches significantly increase runtime due to the super-quadratic complexity of spectral clustering.
  • Figure 3: Distribution of spectral cluster sizes under different values of $\lambda$, which controls the number of clusters.
  • Figure 4: Performance comparison of HSSAL variants on UCM and AID datasets across different labeling ratios.
  • Figure 5: UMAP visualization of the distribution of unlabeled training samples under the 2% labeled setting on the AID dataset. (a) True distribution of unlabeled samples based on the Random-trained model; (b) Spectral clustering results with 140 clusters (for clarity, only 140 clusters are shown instead of the 420 corresponding to $\lambda=3$); (c) Unlabeled samples selected by the Random strategy based on the Random+FixMatch-trained model; (d) Unlabeled samples selected by the HAL strategy based on the Random+FixMatch-trained model.
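
The HAL steps named in the Figure 1 caption (mini-batch partitioning, spectral clustering, uncertainty-aware sampling) can be sketched roughly as follows. This is a simplified NumPy illustration, not the paper's algorithm: the function names are hypothetical, the random partitioning, the RBF affinity with a median bandwidth, and the rule of picking the single most uncertain sample per cluster are all assumptions, and the per-sample uncertainty scores are taken as given rather than computed from gradients.

```python
import numpy as np

def spectral_clusters(feats, k, rng, n_iter=20):
    # Dense RBF affinity; the median-distance bandwidth is an assumption.
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    sigma2 = np.median(d2[d2 > 0]) if np.any(d2 > 0) else 1.0
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    # Symmetric normalized Laplacian and its k smallest eigenvectors.
    L = np.eye(len(feats)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)
    emb = vecs[:, :k]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    # Plain k-means (Lloyd's iterations) on the spectral embedding.
    centers = emb[rng.choice(len(emb), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = emb[labels == c].mean(axis=0)
    return labels

def hal_select(feats, uncertainty, batch_size, clusters_per_batch, seed=0):
    """Return indices of unlabeled samples to send for annotation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(feats))       # random mini-batch partition
    selected = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        k = min(clusters_per_batch, len(idx))
        labels = spectral_clusters(feats[idx], k, rng)
        for c in np.unique(labels):           # one query per non-empty cluster
            members = idx[labels == c]
            selected.append(int(members[np.argmax(uncertainty[members])]))
    return sorted(selected)
```

Clustering each mini-batch separately keeps the eigendecomposition small, which matches the runtime behavior described for Figure 2: a dense eigendecomposition grows roughly cubically with the number of samples, so one large batch is far more expensive than several small ones.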