UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes

Shuo Ni; Di Wang; He Chen; Haonan Guo; Ning Zhang; Jing Zhang

TL;DR

GeoSeg-1M and GeoSeg-Bench address the lack of large-scale, open-world, instruction-driven segmentation data for geospatial scenes by unifying referring, interactive, and reasoning segmentation tasks. The authors introduce UniGeoSeg, a unified baseline that combines task-aware text enhancement (TATE), latent knowledge memory (LKM), and progressive task scheduling (PTS) to support multi-task learning and robust generalization. Experiments show state-of-the-art performance on GeoSeg-Bench and strong zero-shot and cross-dataset generalization across remote sensing (RS) benchmarks. The resources provide a scalable foundation for instruction-driven geospatial understanding and open-world RS intelligence.

Abstract

Instruction-driven segmentation in remote sensing generates masks from guidance, offering great potential for accessible and generalizable applications. However, existing methods suffer from fragmented task formulations and limited instruction data, which hinder effective understanding and generalization. To address these issues, we introduce GeoSeg-1M, the first million-scale dataset for remote sensing instruction-driven segmentation, constructed via an automatic mask filtering and instruction generation pipeline that synthesizes referring, interactive, and reasoning segmentation instructions from multiple public datasets. GeoSeg-1M contains 590K images, 117 categories, and 1.1M image-mask-instruction triplets. Building upon this foundation, we further curate GeoSeg-Bench, a challenging benchmark designed to evaluate contextual understanding and reasoning capabilities across diverse instruction-driven tasks and complex geospatial scenes. Furthermore, we present UniGeoSeg, a unified framework that serves as a strong baseline, incorporating task-aware text enhancement, latent knowledge memory, and a progressive training strategy to facilitate multi-task learning. Extensive experiments demonstrate that UniGeoSeg achieves state-of-the-art performance on GeoSeg-Bench and diverse public benchmarks while exhibiting strong zero-shot generalization. Datasets and source code are available at https://github.com/MiliLab/UniGeoSeg.
