Beyond Few-shot Object Detection: A Detailed Survey

Vishal Chudasama, Hiran Sarkar, Pankaj Wasnik, Vineeth N Balasubramanian, Jayateja Kalla

TL;DR

This survey addresses object detection with limited labeled data by organizing and evaluating methods across five FSOD settings: standard FSOD, generalized FSOD (G-FSOD), incremental FSOD (I-FSOD), open-set FSOD (O-FSOD), and domain-adaptive FSOD (FSDAOD). It divides standard FSOD techniques into six families (meta-learning, metric/classification refinement, proposal quality, attention/feature enhancement, data sampling/scale, and knowledge transfer) and reviews the four specialized settings with representative methods and datasets. The paper analyzes datasets and evaluation protocols (e.g., VOC, COCO, LVIS) and synthesizes reported results to show which approaches perform best under different constraints, while discussing practical considerations and limitations. It also outlines key challenges (data scarcity, forgetting, domain shift, and unknown classes) and points to opportunities such as integrating state-of-the-art detectors, unsupervised incremental learning, multimodal data, and foundation-model features to advance FSOD in real-world settings.

Abstract

Object detection is a critical field in computer vision focusing on accurately identifying and locating specific objects in images or videos. Traditional methods for object detection rely on large labeled training datasets for each object category, which can be time-consuming and expensive to collect and annotate. To address this issue, researchers have introduced few-shot object detection (FSOD) approaches that merge few-shot learning and object detection principles. These approaches allow models to quickly adapt to new object categories with only a few annotated samples. While traditional FSOD methods have been studied before, this survey paper comprehensively reviews FSOD research with a specific focus on covering different FSOD settings such as standard FSOD, generalized FSOD, incremental FSOD, open-set FSOD, and domain adaptive FSOD. These approaches play a vital role in reducing the reliance on extensive labeled datasets, particularly as the need for efficient machine learning models continues to rise. This survey paper aims to provide a comprehensive understanding of the above-mentioned few-shot settings and explore the methodologies for each FSOD task. It thoroughly compares state-of-the-art methods across different FSOD settings, analyzing them in detail based on their evaluation protocols. Additionally, it offers insights into their applications, challenges, and potential future directions in the evolving field of object detection with limited data.

Paper Structure

This paper contains 39 sections, 2 equations, 8 figures, 11 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Illustration of the traditional few-shot learning setting: a pre-trained model adapts to new classes from only a few data samples.
  • Figure 2: Timeline of FSOD efforts: (i) standard FSOD works are highlighted in brown; (ii) generalized FSOD (G-FSOD) works in red; (iii) incremental FSOD (I-FSOD) works in blue; (iv) open-set FSOD (O-FSOD) works in magenta; and (v) domain-adaptive FSOD (FSDAOD) works in green.
  • Figure 3: Taxonomy of standard object detection architectures.
  • Figure 4: Two-stage object detector: architecture of the Faster R-CNN model. Image courtesy of faster_rcnn.
  • Figure 5: Single-stage object detector: an illustration of the YOLO object detector pipeline. Image courtesy of yolo_v1.
  • ...and 3 more figures