Breaking the Box: Enhancing Remote Sensing Image Segmentation with Freehand Sketches
Ying Zang, Yuncan Gao, Jianqi Zhang, Yuanqi Hu, Runlong Cao, Lanyun Zhu, Qi Zhu, Deyi Ji, Renjun Xu, Tianrun Chen
TL;DR
The paper tackles remote sensing image segmentation under extreme scale and viewpoint variability by using freehand sketches as prompts, a more intuitive interaction than points or boxes. It contributes the LTL-Sensing dataset, which pairs human sketches with remote sensing images and ground-truth masks, and LTL-Net, a sketch-aware segmentation model. LTL-Net fuses sketch and image features and uses a masked attention mechanism together with a multi-prompt transport module to map multiple sketches robustly to image regions. Empirical results show that sketch-guided prompting substantially improves segmentation accuracy and robustness over SAM and related sketch-based methods across object sizes and scenes, which points to more effective human-AI collaboration in environmental monitoring, disaster response, and urban analysis. Overall, the approach advances zero-shot interactive segmentation in remote sensing by combining intuitive user input, a dedicated annotated dataset, and a network design that handles sketch variability through augmentation and optimal transport-based multi-prompt alignment.
Abstract
This work advances zero-shot interactive segmentation for remote sensing imagery through three key contributions. First, we propose a novel sketch-based prompting method that lets users intuitively outline objects and surpasses traditional point or box prompts. Second, we introduce LTL-Sensing, the first dataset pairing human sketches with remote sensing imagery, setting a benchmark for future research. Third, we present LTL-Net, a model featuring a multi-input prompting transport module tailored for freehand sketches. Extensive experiments show that our approach significantly improves segmentation accuracy and robustness over state-of-the-art methods such as SAM, fostering more intuitive human-AI collaboration in remote sensing analysis and its applications.
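
As a rough illustration of what optimal transport-based alignment between multiple sketch prompts and image regions can look like, the sketch below computes an entropy-regularized transport plan with Sinkhorn iterations. It is not the LTL-Net transport module, whose details are not given in this summary. The function names (sinkhorn_plan, cosine_cost), the uniform marginals, the cosine cost, the feature dimensions, and the values of eps and n_iters are all assumptions chosen for the example.

```python
import numpy as np

def cosine_cost(prompts, regions):
    """Cost matrix: 1 - cosine similarity between prompt and region features."""
    p = prompts / np.linalg.norm(prompts, axis=1, keepdims=True)
    r = regions / np.linalg.norm(regions, axis=1, keepdims=True)
    return 1.0 - p @ r.T

def sinkhorn_plan(cost, eps=0.5, n_iters=500):
    """Entropy-regularized transport plan between uniform marginals (assumed)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # mass assigned to each sketch prompt
    b = np.full(m, 1.0 / m)   # mass assigned to each image region
    K = np.exp(-cost / eps)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows to match a
        v = b / (K.T @ u)     # rescale columns to match b
    return u[:, None] * K * v[None, :]

# Hypothetical features: 3 sketch prompts and 8 image regions, 16 dimensions each.
rng = np.random.default_rng(0)
sketch_feats = rng.normal(size=(3, 16))
region_feats = rng.normal(size=(8, 16))

plan = sinkhorn_plan(cosine_cost(sketch_feats, region_feats))
# plan[i, j] is the share of prompt i's mass routed to region j.
```

Each row of plan can be read as a soft assignment of one sketch prompt over the image regions. A smaller eps makes the assignment sharper but slows convergence of the iterations.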
