Fine-Grained Prototypes Distillation for Few-Shot Object Detection

Zichen Wang, Bo Yang, Haonan Yue, Zhenghao Ma

TL;DR

This paper distills the most representative support features into fine-grained prototypes. It also proposes a Balanced Class-Agnostic Sampling strategy and a Non-Linear Fusion module, which approach high-level feature fusion from different perspectives, complement each other, and depict high-level feature relations more effectively.

Abstract

Few-shot object detection (FSOD) aims at extending a generic detector to novel objects with only a few training examples. It has attracted considerable attention recently because of its practical significance. Meta-learning has been demonstrated to be an effective paradigm for this task. In general, methods based on meta-learning employ an additional support branch to encode novel examples (a.k.a. support images) into class prototypes, which are then fused with the query branch to facilitate the model prediction. However, class-level prototypes are difficult to generate precisely, and they also lack detailed information, leading to unstable performance. New methods are required to capture the distinctive local context for more robust novel object detection. To this end, we propose to distill the most representative support features into fine-grained prototypes. These prototypes are then assigned to query feature maps based on the matching results, modeling the detailed feature relations between the two branches. This process is realized by our Fine-Grained Feature Aggregation (FFA) module. Moreover, in terms of high-level feature fusion, we propose a Balanced Class-Agnostic Sampling (B-CAS) strategy and a Non-Linear Fusion (NLF) module from different perspectives. They are complementary to each other and depict the high-level feature relations more effectively. Extensive experiments on the PASCAL VOC and MS COCO benchmarks show that our method sets a new state of the art in most settings. Our code is available at https://github.com/wangchen1801/FPD.
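
The two steps the abstract attributes to FFA (distilling support features into a few fine-grained prototypes, then assigning them to query features by matching) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the use of softmax attention for both steps, the random `feature_queries` standing in for learned parameters, the residual addition, and every function name below are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distill_prototypes(support_feats, feature_queries):
    """Hypothetical distillation step.

    support_feats:   (N_s, C) flattened support feature map of one class
    feature_queries: (K, C) query vectors (assumed learnable in practice)
    returns:         (K, C) fine-grained prototypes
    """
    scale = np.sqrt(support_feats.shape[1])
    # Each query attends over all support locations: (K, N_s)
    attn = softmax(feature_queries @ support_feats.T / scale, axis=-1)
    # Each prototype is an attention-weighted mix of support features.
    return attn @ support_feats

def assign_prototypes(query_feats, prototypes):
    """Hypothetical assignment step.

    query_feats: (N_q, C) flattened query feature map
    prototypes:  (K, C) fine-grained prototypes
    returns:     (N_q, C) query features enriched with matched prototypes
    """
    scale = np.sqrt(query_feats.shape[1])
    # Each query location is matched against all prototypes: (N_q, K)
    attn = softmax(query_feats @ prototypes.T / scale, axis=-1)
    # Residual addition of the matched prototypes (an assumption).
    return query_feats + attn @ prototypes

rng = np.random.default_rng(0)
C, K = 16, 5                                # channel width, number of prototypes
support = rng.normal(size=(14 * 14, C))     # toy support feature map
query = rng.normal(size=(20 * 20, C))       # toy query feature map
feature_queries = rng.normal(size=(K, C))   # stand-in for learned queries

protos = distill_prototypes(support, feature_queries)
fused = assign_prototypes(query, protos)
print(protos.shape, fused.shape)
```

In this toy setup the number of prototypes `K` plays the role of the number of feature queries that the paper ablates; the authors' repository linked above is the reference for the actual module.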

Paper Structure (31 sections, 10 equations, 12 figures, 6 tables)

Figures (12)

  • Figure 1: Overview of the proposed method, which we denote as FPD. In addition to class-level prototypes, we distill representative detailed features into fine-grained prototypes, enabling more robust novel object detection.
  • Figure 2: The overall architecture of our method. FFA and NLF are proposed to improve the performance.
  • Figure 3: The architecture of the Fine-Grained Feature Aggregation (FFA) module. It can be divided into Prototypes Distillation and Prototypes Assignment.
  • Figure 4: Visualization of the detection results on novel classes.
  • Figure 5: Ablation study on the number of feature queries.
  • ...and 7 more figures