MultIOD: Rehearsal-free Multihead Incremental Object Detector

Eden Belouadah, Arnaud Dapogny, Kevin Bailly

TL;DR

MultIOD addresses class-incremental object detection (CIOD) in rehearsal-free, resource-constrained settings by embedding a multihead feature pyramid and per-class prediction heads within CenterNet. The approach freezes previously learned components while training the heads of new classes, uses transfer learning between the classes learned initially and those learned incrementally to limit catastrophic forgetting, and applies class-wise NMS to remove duplicate boxes without requiring past data. On two Pascal VOC datasets, MultIOD outperforms distillation-based CenterNet methods while storing only the current model, and it is efficient because the fixed representation strategy reduces the number of parameters. The result is a practical, fast, anchor-free CIOD method aimed at streaming environments.

Abstract

Class-incremental learning (CIL) refers to the ability of artificial agents to integrate new classes as they appear in a stream. It is particularly interesting in evolving environments where agents have limited access to memory and computational resources. The main challenge of incremental learning is catastrophic forgetting, the inability of neural networks to retain past knowledge when learning new knowledge. Unfortunately, most existing class-incremental methods for object detection are applied to two-stage algorithms such as Faster-RCNN, and rely on rehearsal memory to retain past knowledge. We argue that these methods are not suitable for resource-limited environments, and that more effort should be dedicated to anchor-free and rehearsal-free object detection. In this paper, we propose MultIOD, a class-incremental object detector based on CenterNet. Our contributions are: (1) we propose a multihead feature pyramid and multihead detection architecture to efficiently separate class representations, (2) we employ transfer learning between classes learned initially and those learned incrementally to tackle catastrophic forgetting, and (3) we use class-wise non-maximum suppression as a post-processing technique to remove redundant boxes. Results show that our method outperforms state-of-the-art methods on two Pascal VOC datasets while saving only the model in its current state, unlike its distillation-based counterparts.
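The class-wise non-maximum suppression named in contribution (3) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(box, score, class_id)` tuple layout, the greedy suppression scheme, and the IoU threshold of 0.5 are all assumptions made for illustration. The only idea taken from the abstract is that redundant boxes are removed per class, so overlapping boxes of different classes do not suppress each other.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_wise_nms(detections, iou_threshold=0.5):
    """Greedy NMS applied independently within each class.

    detections: list of (box, score, class_id) tuples (hypothetical layout).
    iou_threshold: assumed value, not taken from the paper.
    """
    by_class = defaultdict(list)
    for det in detections:
        by_class[det[2]].append(det)

    kept = []
    for dets in by_class.values():
        # Visit boxes from highest to lowest score within this class.
        dets.sort(key=lambda d: d[1], reverse=True)
        selected = []
        for det in dets:
            # Keep a box only if it does not overlap a kept box too much.
            if all(iou(det[0], s[0]) <= iou_threshold for s in selected):
                selected.append(det)
        kept.extend(selected)
    return kept


detections = [
    ((10, 10, 50, 50), 0.9, "cat"),
    ((12, 12, 52, 52), 0.8, "cat"),      # near-duplicate of the first cat box
    ((12, 12, 52, 52), 0.7, "dog"),      # same location, different class
    ((100, 100, 140, 140), 0.6, "cat"),  # separate cat instance
]
kept = class_wise_nms(detections)
```

In this toy input the second "cat" box is suppressed by the higher-scoring one, while the "dog" box at the same location survives because suppression never crosses class boundaries.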

Paper Structure (32 sections, 7 equations, 7 figures, 12 tables, 1 algorithm)

Figures (7)

  • Figure 1: Mean Average Precision (IoU=0.5) on VOC0712 using different numbers of base classes ($B$) and incremental classes ($I$).
  • Figure 2: Illustration of $MultIOD$, depicting two toy states: one initial state (on the left), and one incremental state (on the right). The model is trained classically in the initial state using data from classes C1 and C2, while in the incremental state, it is updated using data from classes C3 and C4 only. The backbone, the feature pyramid of classes C1 and C2, as well as their detection heads are frozen once these classes are learned. Only the feature pyramid of classes C3 and C4, and their detection heads are trained in the incremental state.
  • Figure 3: Illustration of background interference
  • Figure 4: Pascal VOC incremental protocol
  • Figure 5: Architecture of one Feature Pyramid in $MultIOD$
  • ...and 2 more figures