Efficient Multi-Object Tracking on Edge Devices via Reconstruction-Based Channel Pruning

Jan Müller, Adrian Pigors

TL;DR

This work tackles efficient multi-object tracking on edge devices by pruning Joint Detection and Embedding models with a reconstruction-based, group-aware strategy. It introduces gated groups via DepGraph to selectively prune interdependent layers, enabling global iterative pruning that achieves up to 70% parameter reduction while preserving tracking accuracy on MOT20. The method improves edge deployment feasibility on devices like the Jetson Orin Nano, reducing memory and compute requirements without sacrificing critical MOT metrics such as MOTA, IDF1, and HOTA. The practical impact lies in privacy-preserving, real-time MOT at the network edge, suitable for smart cameras and privacy-conscious deployments.

Abstract

Advances in multi-object tracking (MOT) present a dual challenge: maintaining high performance while addressing critical security and privacy concerns. In applications such as pedestrian tracking, where sensitive personal data is involved, transmitting data to external servers creates a significant risk of privacy violations and data misuse. Processing data directly on an edge device, such as a smart camera, has emerged as a viable way to mitigate these risks. Edge computing keeps sensitive information local, which aligns with stringent privacy principles and significantly reduces network latency. However, edge devices typically have limited computational resources, so MOT on the edge requires highly optimized algorithms that can deliver real-time performance under these constraints. The gap between the computational requirements of state-of-the-art MOT algorithms and the capabilities of edge devices is a significant obstacle. To address it, we propose a neural network pruning method specifically tailored to compress complex networks, such as those used in modern MOT systems. The approach aims for high accuracy and efficiency within the constraints of resource-limited edge devices, such as NVIDIA's Jetson Orin Nano. By applying our pruning method, we achieve model size reductions of up to 70% while maintaining a high level of accuracy and further improving performance on the Jetson Orin Nano, demonstrating the effectiveness of our approach for edge computing applications.

Paper Structure

This paper contains 11 sections, 2 equations, 2 figures, and 1 table.

Figures (2)

  • Figure 1: Overview of the FairMOT architecture. The model integrates object detection and re-identification (Re-ID) using CenterNet [zhou2019objects] with the DLA-34 [yu2018deep] backbone. It produces outputs for object center and class, object size, offset (correcting for quantization at a downsampled resolution of $\frac{H}{4} \times \frac{W}{4}$, where $H$ and $W$ are the input image height and width), and Re-ID embedding. The numbers in the boxes refer to the downsampling factors relative to the original image resolution. (Based on [yu2018deep, zhou2019objects, zhang2021fairmot].)
  • Figure 2: Illustration of a gated group with the convolutional layer $\ell_4^-$ as the pruning target, where $\ell_*^-$ denotes input channels and $\ell_*^+$ output channels for layer $\ell_*$. In this example, $\ell_8$ is the only element within the gate set. The dashed red arrows highlight the propagation of dependencies originating from target $\ell_4^-$. (Adapted from [fang2023depgraph].)
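
The offset output described in the Figure 1 caption can be illustrated with a small sketch. It assumes the usual CenterNet-style convention in which an object center is divided by the downsampling factor of 4, the integer part selects a cell on the $\frac{H}{4} \times \frac{W}{4}$ map, and the fractional remainder is what the offset head would regress. The function names and the example input size are hypothetical and chosen only for illustration; this is not the authors' code.

```python
STRIDE = 4  # downsampling factor of the output maps, as stated in the Figure 1 caption


def output_resolution(height, width, stride=STRIDE):
    """Spatial size of the output maps for an input of height x width."""
    return height // stride, width // stride


def center_and_offset(cx, cy, stride=STRIDE):
    """Map a center (cx, cy) in input pixels to a grid cell plus a sub-cell offset.

    The integer cell is where the center would be marked on the heatmap;
    the fractional offset is the quantization error the offset output corrects.
    Assumes non-negative coordinates, so int() acts as a floor.
    """
    fx, fy = cx / stride, cy / stride
    ix, iy = int(fx), int(fy)
    return (ix, iy), (fx - ix, fy - iy)


def recover_center(cell, offset, stride=STRIDE):
    """Invert center_and_offset: cell + offset, scaled back to input pixels."""
    (ix, iy), (ox, oy) = cell, offset
    return (ix + ox) * stride, (iy + oy) * stride


if __name__ == "__main__":
    # Hypothetical input size, used only to show the H/4 x W/4 relationship.
    print(output_resolution(608, 1088))
    cell, offset = center_and_offset(101, 50)
    print(cell, offset)
    print(recover_center(cell, offset))
```

Without the offset, a center could only be recovered to the nearest multiple of 4 pixels; adding the regressed fraction restores the original position.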