Segment Using Just One Example
Pratik Vora, Sudipan Saha
TL;DR
This work tackles one-shot semantic segmentation in Earth observation by using a single example image with a known mask to segment the same target in a query image without any training. It leverages Segment Anything (SAM) with four image-based, text-free prompt strategies applied to a stitched key–query image, and combines multiple SAM runs via ensemble and confidence-weighted aggregation, followed by morphological post-processing. The approach is evaluated on building and car segmentation from the ISPRS Potsdam dataset, demonstrating that building segmentation benefits from the method while car segmentation remains challenging due to smaller object size; the method outperforms a fine-tuned UNet baseline in this setting. This indicates the potential of foundation-model-based, data-efficient segmentation for rapid, deployment-ready Earth observation tasks, with future work aimed at improving small-object segmentation and handling lower-resolution imagery.
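The aggregation and post-processing steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each SAM run yields a binary mask plus a scalar confidence, that the masks are fused by a normalized weighted vote, and that post-processing is a morphological opening followed by a closing. The function names, the 0.5 threshold, and the 3x3 structuring element are all hypothetical choices.

```python
import numpy as np
from scipy import ndimage


def aggregate_masks(masks, confidences, threshold=0.5):
    """Fuse binary masks by a confidence-weighted vote (assumed scheme)."""
    masks = np.asarray(masks, dtype=float)          # shape (n_runs, H, W)
    weights = np.asarray(confidences, dtype=float)  # shape (n_runs,)
    weights = weights / weights.sum()               # normalize to sum to 1
    score = np.tensordot(weights, masks, axes=1)    # shape (H, W), in [0, 1]
    return score >= threshold


def clean_mask(mask, size=3):
    """Opening removes small specks; closing then fills small holes."""
    structure = np.ones((size, size), dtype=bool)   # hypothetical 3x3 element
    opened = ndimage.binary_opening(mask, structure=structure)
    return ndimage.binary_closing(opened, structure=structure)


# Toy example: two confident runs agree on a 4x4 block and share a stray pixel;
# a third, low-confidence run marks only an unrelated pixel.
a = np.zeros((8, 8), dtype=bool)
a[2:6, 2:6] = True
a[0, 7] = True
b = a.copy()
c = np.zeros((8, 8), dtype=bool)
c[7, 0] = True

fused = aggregate_masks([a, b, c], confidences=[0.9, 0.8, 0.3])
cleaned = clean_mask(fused)
```

In this toy case the low-confidence pixel is voted out during fusion, and the stray pixel shared by the two confident runs is removed by the opening, leaving only the 4x4 block.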
Abstract
Semantic segmentation is an important topic in computer vision with many relevant applications in Earth observation. While supervised methods exist, the constraint of limited annotated data has encouraged the development of unsupervised approaches. However, existing unsupervised methods resemble clustering, and their outputs cannot be directly mapped to explicit target classes. In this paper, we address single-shot semantic segmentation, where a single example of the target class is provided and used to segment that class in query/test images. Our approach exploits the recently popular Segment Anything Model (SAM), a promptable foundation model. We design several techniques that automatically generate prompts from the single example/key image, such that segmentation is performed on a stitch (concatenation) of the example/key and query/test images. The proposed technique involves no training phase and requires only one example image to grasp the concept. Furthermore, it requires no text-based prompt. We evaluate the proposed techniques on the building and car classes.
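As a rough illustration of the stitching idea, the sketch below concatenates a key image and a query image side by side and derives positive point prompts from the key mask. It is a sketch under stated assumptions, not the paper's prompt strategies: the left-right layout, the random sampling of foreground pixels, and the helper names `stitch`, `sample_point_prompts`, and `crop_query` are all hypothetical. Points are returned in (x, y) pixel order, the convention SAM's point prompts use, and the call to SAM itself is omitted.

```python
import numpy as np


def stitch(key_img, query_img):
    """Concatenate key and query images side by side (key on the left)."""
    if key_img.shape[0] != query_img.shape[0]:
        raise ValueError("key and query must share the same height")
    return np.concatenate([key_img, query_img], axis=1)


def sample_point_prompts(key_mask, n_points=5, seed=0):
    """Pick foreground pixels of the key mask as positive point prompts.

    Because the key image sits on the left of the stitch, its pixel
    coordinates are unchanged in the stitched image.
    """
    ys, xs = np.nonzero(key_mask)
    if len(ys) == 0:
        raise ValueError("key mask has no foreground pixels")
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(ys), size=min(n_points, len(ys)), replace=False)
    points = np.stack([xs[idx], ys[idx]], axis=1)   # (x, y) order
    labels = np.ones(len(idx), dtype=int)           # 1 marks a positive point
    return points, labels


def crop_query(stitched_mask, key_width):
    """Keep only the query half of a mask predicted on the stitched image."""
    return stitched_mask[:, key_width:]


# Toy example with a 4x6 key image and a 4x5 query image.
key_img = np.zeros((4, 6, 3), dtype=np.uint8)
query_img = np.ones((4, 5, 3), dtype=np.uint8)
key_mask = np.zeros((4, 6), dtype=bool)
key_mask[1:3, 2:4] = True

stitched = stitch(key_img, query_img)
points, labels = sample_point_prompts(key_mask, n_points=3)
query_mask = crop_query(np.zeros(stitched.shape[:2], dtype=bool), key_img.shape[1])
```

The prompts mark the target only in the key half. The intent is that a promptable model segmenting the whole stitched image extends the mask to matching regions in the query half, which `crop_query` then extracts.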
