Prototype-Driven Adaptation for Few-Shot Object Detection
Yushen Huang, Zhiming Wang
TL;DR
Few-shot object detection suffers from base-class bias and unstable calibration when novel samples are scarce. The paper proposes Prototype-Driven Alignment (PDA), a plug-in metric head that builds a task-adaptive prototype memory from the support set, maps features through a learnable identity-initialized projection, and optionally applies prototype-conditioned RoI alignment. Metric similarities are computed with best-of-$K$ scoring and combined with the detector logits through temperature-scaled fusion. PDA is designed to be protocol-friendly: the memory is initialized from the support set, EMA updates are optional and introduce no per-class parameters, and the memory is frozen at inference. Experiments on VOC FSOD and GFSOD show consistent improvements in novel-class AP with negligible overhead and little impact on base-class performance, suggesting that the gains stem from improved metric geometry rather than memorization.
Abstract
Few-shot object detection (FSOD) often suffers from base-class bias and unstable calibration when only a few novel samples are available. We propose Prototype-Driven Alignment (PDA), a lightweight, plug-in metric head for DeFRCN that provides a prototype-based "second opinion" complementary to the linear classifier. PDA maintains support-only prototypes in a learnable, identity-initialized projection space and optionally applies prototype-conditioned RoI alignment to reduce geometric mismatch. During fine-tuning, prototypes can be adapted via exponential moving average (EMA) updates on labeled foreground RoIs, without introducing class-specific parameters, and are frozen at inference to ensure strict protocol compliance. PDA employs a best-of-$K$ matching scheme to capture intra-class multi-modality and temperature-scaled fusion to combine metric similarities with detector logits. Experiments on the VOC FSOD and GFSOD benchmarks show that PDA consistently improves novel-class performance with minimal impact on base classes and negligible computational overhead.
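
To make the scoring path concrete, the following is a minimal NumPy sketch of best-of-$K$ matching, temperature-scaled fusion, and an EMA prototype update. It is an illustration under stated assumptions, not the paper's implementation: the function names, the cosine similarity, the additive fusion form with weight `alpha` and temperature `tau`, and the choice of which prototype slot to update are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def best_of_k_scores(roi_feats, prototypes, proj):
    """Cosine similarity of each RoI to the closest of K prototypes per class.

    roi_feats:  (N, D) RoI features
    prototypes: (C, K, D) prototype memory, K slots per class
    proj:       (D, D) projection matrix (identity at initialization)
    Returns an (N, C) array of metric scores.
    """
    z = l2_normalize(roi_feats @ proj)       # (N, D)
    p = l2_normalize(prototypes @ proj)      # (C, K, D)
    sims = np.einsum("nd,ckd->nck", z, p)    # (N, C, K)
    return sims.max(axis=-1)                 # keep the best of K per class


def fuse_logits(det_logits, metric_scores, tau=0.1, alpha=0.5):
    """Assumed fusion: add temperature-scaled metric scores to detector logits."""
    return det_logits + alpha * metric_scores / tau


def ema_update(prototypes, feat, cls, slot, momentum=0.9):
    """Return a copy of the memory with one slot moved toward `feat` by EMA.

    No per-class parameters are learned here; only the memory entry changes.
    At inference this function would simply not be called (memory frozen).
    """
    updated = prototypes.copy()
    updated[cls, slot] = momentum * updated[cls, slot] + (1.0 - momentum) * feat
    return updated


# Toy example: 2 classes, K = 2 prototypes each, 2-dimensional features.
protos = np.array([[[1.0, 0.0], [0.0, 1.0]],
                   [[-1.0, 0.0], [0.0, -1.0]]])
proj = np.eye(2)                              # identity-initialized projection
rois = np.array([[1.0, 0.0]])
scores = best_of_k_scores(rois, protos, proj)
fused = fuse_logits(np.zeros((1, 2)), scores, tau=0.5, alpha=1.0)
new_protos = ema_update(protos, np.array([0.0, 1.0]), cls=0, slot=0)
```

Taking the maximum over the $K$ slots lets a class be represented by several modes, so an RoI only needs to match one of them. The EMA step returns a new array rather than modifying the memory in place, which keeps the frozen-at-inference behavior easy to verify.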
