Incremental Object Detection with Prompt-based Methods
Matthias Neuwirth-Trapp, Maarten Bieshaar, Danda Pani Paudel, Luc Van Gool
TL;DR
The paper applies visual prompt tuning to incremental object detection (IOD) in a domain-incremental setting and benchmarks three prompt-based IL methods (L2P, DualPrompt, S-Prompt) on the challenging D-RICO dataset. Prompt-based methods generally underperform replay-based baselines; DualPrompt performs best among them, especially when the output layer is fixed or when deeper prompting is used. Replaying a small fraction of previous data provides a strong, simple baseline, and prompt length and initialization critically influence performance: smaller initial values and longer prompts benefit deep prompting. Together, the results offer baselines and practical insights for future prompt-based IL in object detection and underscore the continued importance of rehearsal strategies in practical IL scenarios.
Abstract
Visual prompt-based methods have seen growing interest in incremental learning (IL) for image classification. These approaches learn additional embedding vectors while keeping the model frozen, making them efficient to train. However, no prior work has applied such methods to incremental object detection (IOD), leaving their generalizability unclear. In this paper, we analyze three different prompt-based methods under a complex domain-incremental learning setting. We additionally provide a wide range of reference baselines for comparison. Empirically, we show that the prompt-based approaches we tested underperform in this setting. However, a strong yet practical method, combining visual prompts with replaying a small portion of previous data, achieves the best results. Together with additional experiments on prompt length and initialization, our findings offer valuable insights for advancing prompt-based IL in IOD.
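
The two ingredients named in the abstract, learnable prompt vectors attached to a frozen model and replay of a small portion of previous data, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform initialization, the `scale` and `replay_fraction` parameters, and the per-task uniform sampling are all assumptions made for illustration.

```python
import random


def init_prompts(length, dim, scale, seed=0):
    """Create `length` prompt vectors of size `dim`.

    Assumption: uniform initialization in [-scale, scale]; the source only
    states that initialization matters, not which scheme is used.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(dim)]
            for _ in range(length)]


def prepend_prompts(patch_tokens, prompts):
    """Return the prompt vectors followed by the image patch tokens.

    Only the prompts would be trained; the backbone consuming this sequence
    stays frozen. Inputs are copied, not modified.
    """
    dim = len(patch_tokens[0])
    if any(len(p) != dim for p in prompts):
        raise ValueError("prompt and token dimensions must match")
    return [list(p) for p in prompts] + [list(t) for t in patch_tokens]


def build_replay_mix(new_samples, previous_tasks, replay_fraction, seed=0):
    """Mix the new task's data with a small sample from each earlier task.

    Assumption: the same fraction is drawn uniformly at random, without
    replacement, from every previous task.
    """
    if not 0.0 <= replay_fraction <= 1.0:
        raise ValueError("replay_fraction must be in [0, 1]")
    rng = random.Random(seed)
    mixed = list(new_samples)
    for samples in previous_tasks:
        k = round(len(samples) * replay_fraction)
        mixed.extend(rng.sample(samples, k))
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    prompts = init_prompts(length=4, dim=8, scale=0.01)
    tokens = [[0.0] * 8 for _ in range(3)]
    print(len(prepend_prompts(tokens, prompts)))  # prompt count + token count

    old_tasks = [list(range(0, 40)), list(range(40, 100))]
    new_task = list(range(100, 150))
    print(len(build_replay_mix(new_task, old_tasks, replay_fraction=0.1)))
```

In this sketch, prompt length corresponds to `length` and the initialization magnitude to `scale`, the two factors the abstract's additional experiments examine; how they are actually set in the paper is not specified here.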
