Few Shot Semantic Segmentation: a review of methodologies, benchmarks, and open challenges
Nico Catalano, Matteo Matteucci
TL;DR
Few-Shot Semantic Segmentation addresses the challenge of segmenting novel classes from very limited labeled data by integrating principles from few-shot learning with pixel-level segmentation. The paper surveys three main methodological strands—conditional networks, prototypical networks, and latent-space optimization—and discusses how foundation-model strategies, including prompt engineering, multimodal cues, and generalist models, are redefining the field. It aggregates standard benchmarks, datasets, and metrics while highlighting open challenges such as domain shift, cross-domain generalization, and continual learning, and it analyzes latent representations from GANs, contrastive learning, and VAEs. The review also emphasizes the practical promise of vision foundation models like SAM and CLIP for FSS, particularly in data-scarce domains relevant to medicine and agriculture, and it outlines directions for future research toward robust, scalable, and adaptable FSS systems.
Abstract
Semantic segmentation, vital for applications ranging from autonomous driving to robotics, faces significant challenges in domains where collecting large annotated datasets is difficult or prohibitively expensive. In fields such as medicine and agriculture, the scarcity of training images hampers progress. Few-Shot Semantic Segmentation, a novel task in computer vision, addresses this problem by aiming to design models capable of segmenting new semantic classes from only a few examples. This paper presents a comprehensive survey of Few-Shot Semantic Segmentation, tracing its evolution and exploring various model designs, from the more popular conditional and prototypical networks to the more niche latent space optimization methods, and also presents the new opportunities offered by recent foundation models. Through a chronological narrative, we dissect influential trends and methodologies, providing insights into their strengths and limitations. A timeline offers a visual roadmap, marking key milestones in the field's progression. Complemented by quantitative analyses on benchmark datasets and qualitative showcases of seminal works, this survey equips readers with a deep understanding of the topic. By elucidating current challenges, state-of-the-art models, and prospects, we aid researchers and practitioners in navigating the intricacies of Few-Shot Semantic Segmentation and provide a foundation for future development.
