Parameterized Prompt for Incremental Object Detection

Zijia An, Boyu Diao, Ruiqi Liu, Libo Huang, Chuanguang Yang, Fei Wang, Zhulin An, Yongjun Xu

TL;DR

This work tackles catastrophic forgetting in incremental object detection (IOD) by addressing the prompt pool confusion caused by co-occurring objects. It introduces Parameterized Prompts for Incremental Object Detection (P$^2$IOD), which replaces static prompt pools with adaptive, MLP-based prompts (parameterized prompts) injected into the decoder, and adds a parameterized prompt fusion mechanism to constrain updates across tasks. P$^2$IOD also employs pseudo-labeling to mine latent knowledge from co-occurring objects in training images. On VOC2007 and COCO, the approach achieves state-of-the-art performance among the compared baselines, with an improved stability-plasticity trade-off and reduced prompt-related interference in IOD.

Abstract

Recent studies have demonstrated that incorporating trainable prompts into pretrained models enables effective incremental learning. However, the application of prompts in incremental object detection (IOD) remains underexplored. Existing prompt-pool-based approaches assume disjoint class sets across incremental tasks, an assumption that is unsuitable for IOD because it overlooks the inherent co-occurrence phenomenon in detection images. In co-occurring scenarios, unlabeled objects from previous tasks may appear in current-task images, leading to confusion in the prompt pool. In this paper, we hold that prompt structures should exhibit adaptive consolidation properties across tasks, with constrained updates to prevent catastrophic forgetting. Motivated by this, we introduce Parameterized Prompts for Incremental Object Detection (P$^2$IOD). Leveraging the global evolution properties of neural networks, P$^2$IOD employs networks as parameterized prompts to adaptively consolidate knowledge across tasks. To constrain updates to the prompt structure, P$^2$IOD further employs a parameterized prompt fusion strategy. Extensive experiments on the PASCAL VOC2007 and MS COCO datasets demonstrate the effectiveness of P$^2$IOD in IOD and show that it achieves state-of-the-art performance among existing baselines.

Paper Structure

This paper contains 25 sections, 9 equations, 5 figures, 8 tables, 1 algorithm.

Figures (5)

  • Figure 1: Heatmap showing similarity weights between objects and task-specific prompts after four incremental learning steps. Gray indicates irrelevance, red indicates positive correlation, and blue indicates negative correlation. Due to the co-occurrence phenomenon, task 1 objects exhibit high similarity not only with their corresponding task prompt but also with prompts from other tasks.
  • Figure 2: The overall framework of P$^2$IOD. To address the issue of prompt pool confusion, P$^2$IOD redesigns the prompt pool as a parameterized prompt structure consisting of multi-layer perceptron (MLP) bottlenecks. P$^2$IOD introduces independent parameterized prompts at each decoder layer to ensure the diversity of prompts. To further alleviate the problem of catastrophic forgetting, P$^2$IOD proposes a parameterized prompt fusion mechanism, which adds an additional fusion process after each incremental training process to better preserve task information.
  • Figure 3: Visualized comparison between P$^2$IOD and MD-DETR. MD-DETR exhibits more false positives and a faster decline in the confidence of positive targets than P$^2$IOD, indicating the impact of prompt pool confusion.
  • Figure 4: Average Precision ($AP_{50}$, %) and number of parameters (M) for different hidden-layer dimensions in the parameterized prompt structure on PASCAL VOC2007 under the 5+5+5+5 setting.
  • Figure 5: Distribution similarity of prompts across different decoder layers in MD-DETR and P$^2$IOD. A larger A-MMD value indicates a more significant prompt diversity.
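
The Figure 2 caption describes parameterized prompts as MLP bottlenecks, one per decoder layer, followed by a fusion step after each incremental training stage. The summary does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the residual way the prompt is added to a decoder query, the linear interpolation used for fusion, the coefficient `alpha`, and all names and sizes (`init_prompt`, `apply_prompt`, `fuse`, six layers) are hypothetical illustrations, not the authors' implementation.

```python
import random

def init_prompt(dim, hidden, seed=0):
    """Create one bottleneck MLP (dim -> hidden -> dim) with small random weights."""
    rng = random.Random(seed)
    return {
        "down": [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)],
        "up": [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(dim)],
    }

def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def apply_prompt(params, query):
    """Generate a prompt from the query and add it back.

    Assumption: the prompt is injected as a residual; the paper may inject it differently.
    """
    hidden = [max(0.0, h) for h in matvec(params["down"], query)]  # ReLU bottleneck
    delta = matvec(params["up"], hidden)
    return [q + d for q, d in zip(query, delta)]

def fuse(old, new, alpha=0.5):
    """Assumed fusion rule: element-wise interpolation of old and new prompt weights.

    alpha = 1.0 keeps the previous task's prompt; alpha = 0.0 keeps the new one.
    """
    return {
        name: [
            [alpha * o + (1 - alpha) * n for o, n in zip(row_old, row_new)]
            for row_old, row_new in zip(old[name], new[name])
        ]
        for name in old
    }

# Hypothetical sizes: 6 decoder layers, query dimension 8, bottleneck width 2.
num_layers, dim, hidden = 6, 8, 2

# Independent prompts per decoder layer, before and after training on a new task.
prev_prompts = [init_prompt(dim, hidden, seed=i) for i in range(num_layers)]
curr_prompts = [init_prompt(dim, hidden, seed=100 + i) for i in range(num_layers)]

# Fusion step run once after the incremental training stage.
fused_prompts = [fuse(p, c, alpha=0.5) for p, c in zip(prev_prompts, curr_prompts)]

query = [1.0] * dim
out = apply_prompt(fused_prompts[0], query)
```

In this sketch the bottleneck width plays the role of the hidden-layer dimension varied in Figure 4: a wider bottleneck adds parameters to every per-layer prompt.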