Latent Distillation for Continual Object Detection at the Edge
Francesco Pasti, Marina Ceccon, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto, Nicola Bellotto
TL;DR
The paper tackles distribution shifts in object detection on resource-constrained edge devices by evaluating a lightweight detector (NanoDet) for continual learning and introducing Latent Distillation (LD) to reduce update cost. LD shares lower, frozen layers between teacher and student, propagating latent representations to upper layers and applying distillation only to old classes, thereby cutting memory and FLOPs. Across VOC and COCO benchmarks, LD achieves competitive accuracy with substantial overhead reductions (74% fewer distillation parameters and 56% fewer FLOPs) compared to traditional distillation methods, while SID often provides the strongest stability for multi-class tasks. The work demonstrates the practicality of edge-friendly continual learning for one-stage detectors and offers a path toward real-time edge adaptation in dynamic environments.
Abstract
While numerous methods achieving remarkable performance exist in the Object Detection literature, addressing data distribution shifts remains challenging. Continual Learning (CL) offers solutions to this issue, enabling models to adapt to new data while maintaining performance on previous data. This is particularly pertinent for edge devices, common in dynamic environments like automotive and robotics. In this work, we address the memory and computation constraints of edge devices in the Continual Learning for Object Detection (CLOD) scenario. Specifically, (i) we investigate the suitability of an open-source, lightweight, and fast detector, namely NanoDet, for CLOD on edge devices, improving upon larger architectures used in the literature. Moreover, (ii) we propose a novel CL method, called Latent Distillation (LD), that reduces the number of operations and the memory required by state-of-the-art CL approaches without significantly compromising detection performance. Our approach is validated using the well-known VOC and COCO benchmarks, reducing the distillation parameter overhead by 74% and the Floating Point Operations (FLOPs) by 56% per model update compared to other distillation methods.
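
To illustrate the mechanism summarized above, the following is a minimal sketch in plain Python. It is not the authors' implementation. The layer sizes, the helper names (dense, make_layer, n_params, latent_distillation_loss) and the mean-squared-error distillation term on class outputs are all assumptions made for illustration; the real method operates on NanoDet and its detection losses. The sketch shows two points: the frozen lower layers are evaluated once and their latent output feeds both the frozen teacher upper layers and the trainable student upper layers, and the distillation term is computed on old classes only.

```python
import copy
import random


def dense(layer, x):
    """Fully connected layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def make_layer(n_out, n_in, rng):
    """Hypothetical toy layer with random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def n_params(layer):
    """Number of scalar parameters (weights plus biases) in a layer."""
    return sum(len(row) for row in layer["w"]) + len(layer["b"])


def latent_distillation_loss(x, lower, teacher_upper, student_upper, old_classes):
    """Distillation term computed from a single shared latent representation."""
    # Frozen lower layers run once; teacher and student reuse the result.
    latent = [max(0.0, v) for v in dense(lower, x)]
    teacher_out = dense(teacher_upper, latent)
    student_out = dense(student_upper, latent)
    # Assumption: MSE on class outputs, restricted to the old classes.
    return sum((student_out[c] - teacher_out[c]) ** 2
               for c in old_classes) / len(old_classes)


rng = random.Random(0)
lower = make_layer(16, 8, rng)                 # frozen, shared
teacher_upper = make_layer(4, 16, rng)         # frozen copy from the previous task
student_upper = copy.deepcopy(teacher_upper)   # trainable, starts from the teacher
old_classes = [0, 1, 2]                        # class 3 plays the role of the new class
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]

# Identical upper layers: the student matches the teacher exactly.
loss_same = latent_distillation_loss(x, lower, teacher_upper, student_upper, old_classes)

# Changing only the new-class output is not penalized by the distillation term.
student_upper["b"][3] += 1.0
loss_new_only = latent_distillation_loss(x, lower, teacher_upper, student_upper, old_classes)

# Drifting on an old class is penalized.
student_upper["b"][0] += 0.1
loss_old = latent_distillation_loss(x, lower, teacher_upper, student_upper, old_classes)

# Extra parameters kept in memory for the teacher during an update.
full_teacher_overhead = n_params(lower) + n_params(teacher_upper)  # standard distillation
ld_teacher_overhead = n_params(teacher_upper)                      # shared lower layers

print(loss_same, loss_new_only, loss_old)
print(full_teacher_overhead, ld_teacher_overhead)
```

In this toy setting the teacher overhead drops from a full model copy to a copy of the upper layers alone, and the lower layers are evaluated once per input rather than twice. The toy parameter counts are unrelated to the 74% and 56% reductions reported for NanoDet; they only show where the savings come from.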
