Latent Replay Detection: Memory-Efficient Continual Object Detection on Microcontrollers via Task-Adaptive Compression

Bibin Wilson

TL;DR

Latent Replay Detection (LRD) is the first framework for continual object detection under MCU memory constraints; its task-adaptive FiLM compression and spatially diverse exemplar selection work together to preserve detection capabilities.

Abstract

Deploying object detection on microcontrollers (MCUs) enables intelligent edge devices, but current models cannot learn new object categories after deployment. Existing continual learning methods store raw images, which far exceed MCU memory budgets of tens of kilobytes. We present Latent Replay Detection (LRD), the first framework for continual object detection under MCU memory constraints. Our key contributions are: (1) Task-Adaptive Compression: unlike fixed PCA, we propose learnable compression with FiLM (Feature-wise Linear Modulation) conditioning, in which task-specific embeddings modulate the compression to preserve discriminative features for each task's distribution; (2) Spatial-Diverse Exemplar Selection: traditional sampling ignores the spatial information that is critical for detection, so we select exemplars that maximize bounding-box diversity via farthest-point sampling in IoU space, preventing localization bias in replay; (3) MCU-Deployable System: our latent replay stores 150 bytes per sample versus >10KB for images, so a 64KB buffer holds 400+ exemplars. Experiments on CORe50 (50 classes, 5 tasks) demonstrate that LRD achieves mAP@50 on the initial task and maintains strong performance across subsequent tasks, a significant improvement over naive fine-tuning while operating within strict MCU constraints. Our task-adaptive FiLM compression and spatially diverse exemplar selection work synergistically to preserve detection capabilities. Deployed on STM32H753ZI, ESP32-S3, and MAX78000 MCUs, LRD achieves 4.9-97.5ms latency per inference within a 64KB memory budget, enabling practical continual detection on edge devices for the first time.
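
To make the exemplar-selection idea in the abstract concrete, the following is a minimal sketch of greedy farthest-point sampling over bounding boxes, assuming the distance between two boxes is 1 - IoU. The function names (`iou`, `select_spatially_diverse`), the `(x1, y1, x2, y2)` box format, and the choice of the first box as the seed are illustrative assumptions; the paper's actual selection procedure may differ.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_spatially_diverse(boxes, k, start=0):
    """Pick up to k box indices by farthest-point sampling.

    Assumption: distance between boxes is 1 - IoU, and `start` is the
    (hypothetical) seed index. Each step adds the box whose nearest
    already-selected box is farthest away.
    """
    if not boxes or k <= 0:
        return []
    selected = [start]
    # min_dist[i]: distance from box i to its nearest selected box.
    min_dist = [1.0 - iou(boxes[start], b) for b in boxes]
    min_dist[start] = -1.0  # mark as taken so it is never picked again
    while len(selected) < min(k, len(boxes)):
        nxt = max(range(len(boxes)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist[nxt] = -1.0
        for i, b in enumerate(boxes):
            if min_dist[i] >= 0.0:
                min_dist[i] = min(min_dist[i], 1.0 - iou(boxes[nxt], b))
    return selected
```

For example, given the boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 10.5)]` and `k=3`, the sketch selects the seed, then the distant box at index 2, then the partially overlapping box at index 1, and skips the near-duplicate at index 3. Each step costs O(n) IoU evaluations, so selecting k exemplars from n candidates is O(nk).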

Paper Structure

This paper contains 68 sections, 3 theorems, 19 equations, 3 figures, 9 tables, and 1 algorithm.

Key Result

Theorem 1

The expected forgetting after $T$ tasks is bounded by an expression involving $M$, the replay memory size, and $\eta$, the learning rate.

Figures (3)

  • Figure 1: LRD enables continual object detection on MCUs. Our method stores compressed latent features instead of raw images, fitting a replay buffer within 64KB while supporting incremental learning of new object categories.
  • Figure 2: Task progression. LRD maintains performance on Task 1 while learning subsequent tasks. Fine-tuning catastrophically forgets.
  • Figure 3: Qualitative results. LRD retains detection capability on objects from early tasks, correctly detecting objects from Tasks 1-5 after all training. Fine-tuning exhibits catastrophic forgetting, missing objects from early tasks while detecting only recent ones.

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3