Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection
Gaurav Bhatt, James Ross, Leonid Sigal
TL;DR
This work tackles catastrophic forgetting in continual object detection by introducing MD-DETR, a memory-augmented transformer that adapts a Deformable DETR backbone while preserving past knowledge through a dedicated memory module and a localized query mechanism. It adds continual optimization strategies (memory chunk freezing, gradient masking, and background thresholding) to counter background relegation, and trains with a joint objective $\mathcal{L} = \mathcal{L}_{detr} + \lambda_Q \mathcal{L}_Q$. Empirically, MD-DETR achieves state-of-the-art results on MS-COCO and PASCAL-VOC in a replay-free setting, with improvements of about $5$-$7\%$ overall and up to $\sim10\%$ on challenging tasks, outperforming replay-based baselines. The work also provides detailed ablations and qualitative analyses that highlight the effectiveness of memory-based retrieval for continual detection and outline remaining challenges, such as bounding-box deformation and confidence drift for past classes.
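As a rough illustration of two ideas named above, the joint objective $\mathcal{L} = \mathcal{L}_{detr} + \lambda_Q \mathcal{L}_Q$ and gradient masking for frozen memory chunks, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the row-per-memory-unit layout, the plain SGD update, and the boolean `frozen` flags are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def joint_loss(l_detr: float, l_query: float, lambda_q: float) -> float:
    """Combine the detection loss and the query loss: L = L_detr + lambda_Q * L_Q."""
    return l_detr + lambda_q * l_query


def mask_memory_gradients(grads: Matrix, frozen: List[bool]) -> Matrix:
    """Zero the gradient rows of memory units marked as frozen.

    Assumption: each row of `grads` belongs to one memory unit, and
    `frozen[i]` is True for units reserved for earlier tasks.
    """
    return [
        [0.0] * len(row) if is_frozen else list(row)
        for row, is_frozen in zip(grads, frozen)
    ]


def sgd_step(memory: Matrix, grads: Matrix, lr: float) -> Matrix:
    """Plain SGD update (hypothetical optimizer choice for this sketch)."""
    return [
        [w - lr * g for w, g in zip(m_row, g_row)]
        for m_row, g_row in zip(memory, grads)
    ]


# Two memory units: the first is frozen (old task), the second is trainable.
memory = [[1.0, 1.0], [1.0, 1.0]]
grads = [[0.5, 0.5], [0.5, 0.5]]
frozen = [True, False]

updated = sgd_step(memory, mask_memory_gradients(grads, frozen), lr=1.0)
total = joint_loss(l_detr=2.0, l_query=0.5, lambda_q=0.5)
```

In this sketch the frozen unit keeps its values after the update while the trainable unit moves, which is the behavior that freezing combined with gradient masking is meant to guarantee for memory associated with earlier tasks.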
Abstract
Modern pre-trained architectures struggle to retain previous information while undergoing continuous fine-tuning on new tasks. Despite notable progress in continual classification, systems designed for complex vision tasks such as detection or segmentation still fall short of satisfactory performance. In this work, we introduce a memory-based detection transformer architecture that adapts a pre-trained DETR-style detector to new tasks while preserving knowledge from previous tasks. We propose a novel localized query function for efficient information retrieval from memory units, aiming to minimize forgetting. Furthermore, we identify a fundamental challenge in continual detection, which we refer to as background relegation. It arises when object categories from earlier tasks reappear in future tasks, potentially without labels, so that they are implicitly treated as background. This issue is inevitable in continual detection and segmentation, and the continual optimization technique we introduce effectively tackles it. Finally, we evaluate the proposed system on continual detection benchmarks and demonstrate that it outperforms existing state-of-the-art methods, with 5-7% improvements on MS-COCO and PASCAL-VOC.
