Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection

Yaoteng Zhang, Zhou Qing, Junyu Gao, Qi Wang

TL;DR

PDP is a prompt-decoupled framework for incremental object detection that separates prompts into a shared pool and a private pool. It achieves state-of-the-art performance on the MS-COCO and PASCAL VOC benchmarks, highlighting its potential for balancing stability and plasticity.

Abstract

Incremental Object Detection (IOD) aims to continuously learn new object categories without forgetting previously learned ones. Recently, prompt-based methods have gained popularity for their replay-free design and parameter efficiency. However, because of prompt coupling and prompt drift, these methods often suffer from prompt degradation during continual adaptation. To address these issues, we propose a novel prompt-decoupled framework called PDP. PDP introduces a dual-pool prompt decoupling paradigm consisting of a shared pool, which captures task-general knowledge for forward transfer, and a private pool, which learns task-specific discriminative features. This paradigm explicitly separates task-general and task-specific prompts, preventing interference between prompts and mitigating prompt coupling. In addition, prompt drift arises from inconsistent supervision, where old foreground objects are treated as background in subsequent tasks. To counteract it, PDP introduces a Prototypical Pseudo-Label Generation (PPG) module. PPG dynamically updates the class prototype space during training and uses the class prototypes to filter valuable pseudo-labels, keeping the supervisory signal consistent throughout the incremental process. PDP achieves state-of-the-art performance on the MS-COCO (9.2% AP improvement) and PASCAL VOC (3.3% AP improvement) benchmarks, highlighting its potential for balancing stability and plasticity. The code and dataset are released at: https://github.com/zyt95579/PDP_IOD/tree/main

Paper Structure

This paper contains 16 sections, 12 equations, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: Comparison of prompt-based methods. (a) Using task IDs to isolate prompts forces the task-general prompt to be relearned at each task. (b) Missing annotations cause prompt tokens to drift. (c) The shared prompt pool continuously optimizes the task-general prompt, and category prototypes guide the generation of pseudo-labels for old categories.
  • Figure 2: Overview of our framework at incremental step $t$. Given an image $x$, the query function generates a content-aware query representation by adaptively computing query weights via a ranking function $F_\psi$ and performing weighted aggregation. Prompts are then retrieved from both the shared and private pools and injected into the decoder layer. In parallel, the teacher model $\Phi_{t-1}$ generates a set of candidate bounding boxes, and the potentially valuable ones are projected into the feature space to compute their similarity with class prototypes. This process yields a set of refined, high-quality pseudo-labels that guide the training of the student model $\Phi_t$.
  • Figure 3: The pseudo-label generation process of PPG at stage $t$. PPG dynamically updates the prototypes of new task classes while keeping the old class prototypes frozen. The teacher model $\Phi_{t-1}$ produces a set of candidate detections, where high-confidence predictions are directly regarded as reliable samples. For low-confidence candidates, similarity matching with frozen old class prototypes is performed, and those exceeding a predefined threshold are also considered reliable. Finally, both types of samples are merged to generate high-quality pseudo-labels.
  • Figure 4: Ablation on shared and private pool sizes. Performance of PDP under different $(N_s, N_p)$ configurations on the COCO incremental detection benchmark, reported in $mAP@C$, $mAP@P$, and $mAP@A$ across sequential tasks.
  • Figure 5: Visualization of old-class detection results on the PASCAL VOC dataset.
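
The two-branch selection described in the Figure 3 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the candidate dictionary layout, cosine similarity as the matching measure, the choice to compare each candidate only against the prototype of its predicted class, and both threshold values are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def select_pseudo_labels(candidates, old_prototypes, conf_thresh=0.7, sim_thresh=0.8):
    """Merge high-confidence detections with prototype-matched low-confidence ones.

    candidates: list of dicts with keys "box", "label", "score", "feature"
                (hypothetical layout for teacher-model detections).
    old_prototypes: dict mapping an old class label to its frozen prototype vector.
    conf_thresh, sim_thresh: illustrative values, not taken from the paper.
    """
    reliable = []
    for cand in candidates:
        # Branch 1: high-confidence teacher predictions are accepted directly.
        if cand["score"] >= conf_thresh:
            reliable.append(cand)
            continue
        # Branch 2: low-confidence candidates are kept only if their feature
        # is close enough to the frozen prototype of the predicted old class.
        proto = old_prototypes.get(cand["label"])
        if proto is not None and cosine(cand["feature"], proto) >= sim_thresh:
            reliable.append(cand)
    return reliable


# Toy example with 2-D features (all values are made up).
prototypes = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
candidates = [
    {"box": (0, 0, 10, 10), "label": "cat", "score": 0.9, "feature": [0.2, 0.9]},
    {"box": (5, 5, 20, 20), "label": "dog", "score": 0.4, "feature": [0.1, 0.99]},
    {"box": (1, 1, 4, 4), "label": "cat", "score": 0.3, "feature": [0.1, 0.9]},
]
kept = select_pseudo_labels(candidates, prototypes)
kept_boxes = [c["box"] for c in kept]
```

In this toy run the first candidate passes on confidence alone, the second is recovered through prototype matching, and the third is discarded because its feature is far from the "cat" prototype. The prototype update for new-task classes mentioned in the caption is not shown here.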