Beyond Few-shot Object Detection: A Detailed Survey
Vishal Chudasama, Hiran Sarkar, Pankaj Wasnik, Vineeth N Balasubramanian, Jayateja Kalla
TL;DR
This survey addresses object detection with limited labeled data by organizing and evaluating methods across five few-shot object detection (FSOD) settings: standard FSOD, generalized FSOD (G-FSOD), incremental FSOD (I-FSOD), open-set FSOD (O-FSOD), and few-shot domain-adaptive object detection (FSDAOD). It divides standard FSOD techniques into six families (meta-learning, metric/classification refinement, proposal quality, attention/feature enhancement, data sampling/scale, and knowledge transfer) and reviews the specialized variants (G-FSOD, I-FSOD, O-FSOD, FSDAOD) with representative methods and datasets. The paper analyzes datasets and evaluation protocols (e.g., VOC, COCO, LVIS) and synthesizes results to highlight which approaches excel under different constraints, while discussing practical considerations and limitations. It also outlines key challenges (data scarcity, forgetting, domain shifts, and unknown classes) and points to opportunities such as integrating state-of-the-art detectors, unsupervised incremental learning, multimodal data, and foundation model features to advance FSOD in real-world settings.
Abstract
Object detection is a critical field in computer vision that focuses on accurately identifying and locating specific objects in images or videos. Traditional object detection methods rely on large labeled training datasets for each object category, which are time-consuming and expensive to collect and annotate. To address this issue, researchers have introduced few-shot object detection (FSOD) approaches that combine the principles of few-shot learning and object detection. These approaches allow models to adapt quickly to new object categories from only a few annotated samples. While traditional FSOD methods have been studied before, this survey comprehensively reviews FSOD research with a specific focus on the different FSOD settings: standard FSOD, generalized FSOD, incremental FSOD, open-set FSOD, and domain adaptive FSOD. These approaches play a vital role in reducing reliance on extensive labeled datasets, particularly as the need for efficient machine learning models continues to rise. The survey aims to provide a comprehensive understanding of these few-shot settings and to explore the methodologies for each FSOD task. It compares state-of-the-art methods across the FSOD settings and analyzes them in detail according to their evaluation protocols. It also offers insights into their applications, challenges, and potential future directions in the evolving field of object detection with limited data.
