Animal Re-Identification on Microcontrollers

Yubo Chen, Di Zhao, Yun Sing Koh, Talia Xu

TL;DR

This paper targets animal re-identification directly on MCU-class devices, where memory and input quality are severely constrained. It shows that conventional knowledge distillation from large transformers yields limited gains under these constraints, and instead takes a structure-guided approach: scaling and pruning a MobileNetV2 backbone for 64×64 inputs, with pretrained initialization and a knee-point depth that preserves accuracy while fitting MCU memory. The result is a compact INT8 CNN (~84 KB) that delivers competitive retrieval accuracy on six public datasets and runs fully on a low-power Arduino, while a data-efficient fine-tuning strategy enables rapid adaptation to new sites. The work points toward practical, scalable on-device Animal Re-ID deployments in field environments with limited connectivity.

Abstract

Camera-based animal re-identification (Animal Re-ID) can support wildlife monitoring and precision livestock management in large outdoor environments with limited wireless connectivity. In these settings, inference must run directly on collar tags or low-power edge nodes built around microcontrollers (MCUs), yet most Animal Re-ID models are designed for workstations or servers and are too large for devices with small memory and low-resolution inputs. We propose an on-device framework. First, we characterise the gap between state-of-the-art Animal Re-ID models and MCU-class hardware, showing that straightforward knowledge distillation from large teachers offers limited benefit once memory and input resolution are constrained. Second, guided by this analysis, we design a high-accuracy Animal Re-ID architecture by systematically scaling a CNN-based MobileNetV2 backbone for low-resolution inputs. Third, we evaluate the framework with a real-world dataset and introduce a data-efficient fine-tuning strategy to enable fast adaptation with just three images per animal identity at a new site. Across six public Animal Re-ID datasets, our compact model achieves competitive retrieval accuracy while reducing model size by over two orders of magnitude. On a self-collected cattle dataset, the deployed model performs fully on-device inference with only a small accuracy drop and unchanged Top-1 accuracy relative to its cluster version. We demonstrate that practical, adaptable Animal Re-ID is achievable on MCU-class devices, paving the way for scalable deployment in real field environments.
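
The retrieval accuracy mentioned in the abstract is typically reported as Top-1 accuracy and mean average precision (mAP) over a ranked gallery. The following is a minimal sketch of how those two metrics are commonly computed from embeddings; it is not the authors' evaluation code. The function names, the `(identity, embedding)` pair layout, and the use of cosine similarity are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieval_metrics(queries, gallery):
    """Return (top1, mAP) for lists of (identity, embedding) pairs.

    Hypothetical helper: each query ranks the whole gallery by cosine
    similarity; queries with no matching gallery identity are skipped.
    """
    top1_hits, average_precisions = [], []
    for q_id, q_emb in queries:
        # Rank gallery items from most to least similar to the query.
        ranked = sorted(gallery, key=lambda item: cosine(q_emb, item[1]), reverse=True)
        matches = [g_id == q_id for g_id, _ in ranked]
        if not any(matches):
            continue
        top1_hits.append(1.0 if matches[0] else 0.0)
        # Average precision: mean of precision@k taken at each correct match.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not top1_hits:
        raise ValueError("no query has a matching identity in the gallery")
    top1 = sum(top1_hits) / len(top1_hits)
    mean_ap = sum(average_precisions) / len(average_precisions)
    return top1, mean_ap


# Toy example: the closest gallery item belongs to a different animal.
queries = [("a", [1.0, 0.0])]
gallery = [("b", [1.0, 0.0]), ("a", [0.9, 0.1]), ("a", [0.0, 1.0])]
top1, mean_ap = retrieval_metrics(queries, gallery)
```

In the toy example the correct identity first appears at rank 2, so Top-1 is 0 while mAP still credits the later matches. This is why the two metrics can move independently when a model is compressed.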

Paper Structure

This paper contains 23 sections, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Comparison of Animal Re-Identification workflows. (a) Traditional cloud-based pipeline, where captured images are uploaded to a remote server for feature extraction and identification. (b) Our proposed MCU-based on-device pipeline, which performs end-to-end inference locally and outputs individual IDs without relying on cloud computation.
  • Figure 2: Comparison between high-resolution and low-resolution sample images in the ATRW dataset.
  • Figure 3: Architecture of the MobileNetV2-based Network with Controllable Depth and Width.
  • Figure 4: mAP versus the number of retained bottleneck blocks $N$ for depth-truncated MobileNetV2 backbones.
  • Figure 5: Cattle from different regions.
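
Figures 3 and 4 refer to a MobileNetV2 backbone whose depth (retained bottleneck blocks $N$) and width are controllable. As a rough illustration of why those two knobs matter for MCU memory, the sketch below estimates the INT8 weight footprint of a truncated backbone. It assumes the standard MobileNetV2 stage settings and one byte per weight, and it ignores biases, batch-norm parameters, quantization metadata, and any embedding head. The paper's exact configuration is not given here, so the output is not expected to reproduce the ~84 KB figure; all function names are hypothetical.

```python
# Standard MobileNetV2 stage settings:
# (expansion factor t, output channels c, repeats n, stride s).
MOBILENETV2_STAGES = [
    (1, 16, 1, 1), (6, 24, 2, 2), (6, 32, 3, 2), (6, 64, 4, 2),
    (6, 96, 3, 1), (6, 160, 3, 2), (6, 320, 1, 1),
]


def make_divisible(value, divisor=8):
    """Round a channel count to a multiple of `divisor` (common MobileNet rule)."""
    rounded = max(divisor, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def conv_weight_count(num_blocks, width_mult=1.0):
    """Count conv weights in the stem plus the first `num_blocks` bottleneck blocks."""
    in_ch = make_divisible(32 * width_mult)
    total = 3 * 3 * 3 * in_ch  # 3x3 stem convolution on an RGB input
    kept = 0
    for t, c, n, _stride in MOBILENETV2_STAGES:
        out_ch = make_divisible(c * width_mult)
        for _ in range(n):
            if kept == num_blocks:
                return total
            hidden = in_ch * t
            if t != 1:
                total += in_ch * hidden  # 1x1 expansion convolution
            total += 9 * hidden          # 3x3 depthwise convolution
            total += hidden * out_ch     # 1x1 linear projection
            in_ch = out_ch
            kept += 1
    return total


def int8_kib(num_blocks, width_mult=1.0):
    """Approximate weight storage in KiB, assuming one byte per INT8 weight."""
    return conv_weight_count(num_blocks, width_mult) / 1024


# Sweep depth at two widths to see how quickly the footprint grows.
for n_blocks in (4, 8, 12, 17):
    print(n_blocks, round(int8_kib(n_blocks, 1.0), 1), round(int8_kib(n_blocks, 0.5), 1))
```

Because the later stages have the widest channels, most of the weights sit in the last few blocks. Truncating depth therefore removes memory much faster than it removes layers, which is consistent with looking for a knee point in accuracy as $N$ shrinks.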