S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing

Liang Lv; Di Wang; Jing Zhang; Lefei Zhang

TL;DR

S5 is a scalable semi-supervised framework for remote sensing (RS) semantic segmentation. It builds a large unlabeled RS corpus, RS4P-1M, with a data selection strategy that combines entropy-based filtering and diversity expansion, and uses that corpus for S4 pre-training (S4P) of RS foundation models. It also introduces MoE-based multi-dataset fine-tuning (MoE-MDF), which adapts a model to multiple RS benchmarks with fewer parameters. On land cover segmentation and object detection tasks, S5 achieves state-of-the-art results and scales well as model size and the amount of unlabeled data increase. The authors state that the datasets, code, and models will be released for community use.

Abstract

Semi-supervised semantic segmentation (S4) has advanced remote sensing (RS) analysis by leveraging unlabeled data through pseudo-labeling and consistency learning. However, existing S4 studies often rely on small-scale datasets and models, limiting their practical applicability. To address this, we propose S5, the first scalable framework for semi-supervised semantic segmentation in RS, which unlocks the potential of vast unlabeled Earth observation data typically underutilized due to costly pixel-level annotations. Built upon existing large-scale RS datasets, S5 introduces a data selection strategy that integrates entropy-based filtering and diversity expansion, resulting in the RS4P-1M dataset. Using this dataset, we systematically scale up S4 into a new pretraining paradigm, S4 pre-training (S4P), to pretrain RS foundation models (RSFMs) of varying sizes on this extensive corpus, significantly boosting their performance on land cover segmentation and object detection tasks. Furthermore, during fine-tuning, we incorporate a Mixture-of-Experts (MoE)-based multi-dataset fine-tuning approach, which enables efficient adaptation to multiple RS benchmarks with fewer parameters. This approach improves the generalization and versatility of RSFMs across diverse RS benchmarks. The resulting RSFMs achieve state-of-the-art performance across all benchmarks, underscoring the viability of scaling semi-supervised learning for RS applications. All datasets, code, and models will be released at https://github.com/MiliLab/S5
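To make the idea of entropy-based filtering concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate image comes with per-pixel class probabilities from some existing segmentation model, and that images with low mean prediction entropy are the ones kept. The abstract does not specify the selection criterion, its direction, or the threshold, so the function names, the `max_entropy` parameter, and the toy data are all hypothetical.

```python
import math


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_image_entropy(prob_map):
    """Average per-pixel entropy of an image given as a list of probability vectors."""
    return sum(pixel_entropy(p) for p in prob_map) / len(prob_map)


def filter_by_entropy(images, max_entropy):
    """Return names of images whose mean prediction entropy is at most max_entropy.

    Assumption: low entropy is treated as a proxy for reliable pseudo-labels.
    """
    return [
        name
        for name, prob_map in images.items()
        if mean_image_entropy(prob_map) <= max_entropy
    ]


# Toy data (hypothetical): two "images" of two pixels each, with two classes.
images = {
    "confident": [[0.99, 0.01], [0.02, 0.98]],
    "uncertain": [[0.5, 0.5], [0.6, 0.4]],
}
kept = filter_by_entropy(images, max_entropy=0.3)
print(kept)
```

In this toy case only the image with near one-hot predictions passes the threshold. A diversity expansion step, as named in the abstract, would then operate on the retained set; it is not sketched here because the abstract gives no detail on how it works.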
