Real-Time Multi-Object Tracking using YOLOv8 and SORT on a SoC FPGA
Michal Danilowicz, Tomasz Kryjak
TL;DR
The paper addresses real-time multi-object tracking on energy-constrained embedded platforms by presenting a heterogeneous SoC FPGA design that couples a quantized YOLOv8_nano detector implemented in the programmable logic (PL) with a SORT tracker executed in the processing system (PS). The detector is trained with Quantization-Aware Training using 4-bit weights and activations, is deployed via a modified FINN framework, and keeps its parameters in external memory. The integrated system achieves a detector throughput of ≈195.3 fps, a detection quality of mAP ≈ 0.21 on COCO, and a tracking performance of MOTA ≈ 38.9 on MOT15. This work demonstrates the practical feasibility of embedded MOT on MPSoC FPGAs and outlines trade-offs and future enhancements for more capable, energy-efficient perception in mobile robotics and autonomous systems.
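To make the "4-bit weights/activations" idea concrete, the following is a minimal sketch of uniform quantization as it is typically simulated in the forward pass of Quantization-Aware Training. It is not the scheme used in the paper: the function name `fake_quantize`, the symmetric signed range, and the per-tensor max-based scale are assumptions chosen for illustration only.

```python
def fake_quantize(values, num_bits=4, signed=True):
    """Map floats to num_bits integer codes and back (illustrative only).

    Assumption: symmetric, per-tensor scaling derived from the largest
    absolute value. Real QAT toolchains usually learn or calibrate the scale.
    """
    if signed:
        qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1  # -8..7 for 4 bits
    else:
        qmin, qmax = 0, 2 ** num_bits - 1  # 0..15 for 4 bits

    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0

    # Round to the nearest integer level, then clamp to the representable range.
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    # De-quantized values are what the rest of the network "sees" during QAT.
    dequantized = [c * scale for c in codes]
    return codes, dequantized, scale


# Hypothetical weights, not taken from the paper.
example_weights = [-0.7, -0.12, 0.0, 0.33, 0.7]
example_codes, example_dequantized, example_scale = fake_quantize(example_weights)
```

With 4 bits, each weight or activation takes one of at most 16 levels, which is what makes the detector small enough for an efficient hardware datapath at the cost of some detection accuracy.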
Abstract
Multi-object tracking (MOT) is one of the most important problems in computer vision and a key component of any vision-based perception system used in advanced autonomous mobile robotics. Therefore, its implementation on low-power, real-time embedded platforms is highly desirable. Modern MOT algorithms should be able to track objects of a given class (e.g. people or vehicles). In addition, the number of objects to be tracked is not known in advance, and they may appear and disappear at any time, as well as be occluded. For these reasons, the most popular and successful approaches have recently been based on the tracking-by-detection paradigm. A high-quality object detector is therefore essential, and in practice it accounts for the vast majority of the computational and memory complexity of the whole MOT system. In this paper, we propose an FPGA (Field-Programmable Gate Array) implementation of an embedded MOT system based on a quantized YOLOv8 detector and the SORT (Simple Online and Realtime Tracking) tracker. We use a modified version of the FINN framework to store model parameters in external memory and to support the operations required by YOLOv8. We evaluate detection and tracking performance on the COCO and MOT15 datasets, where we achieve 0.21 mAP and 38.9 MOTA, respectively. As the computational platform, we use an MPSoC system (Zynq UltraScale+ device from AMD/Xilinx), where the detector is deployed in the reprogrammable logic and the tracking algorithm is implemented in the processor system.
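In a detector-plus-tracker pipeline such as the one described above, the tracker has to match each frame's detections to existing tracks, and the result is scored with MOTA. The sketch below illustrates both steps under stated assumptions. The published SORT formulation predicts track boxes with a Kalman filter and solves the assignment with the Hungarian algorithm on IoU; here a simpler greedy matcher is swapped in, and the Kalman prediction is omitted. The names `iou`, `associate`, and `mota`, the threshold of 0.3, and all example numbers are hypothetical and are not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedy IoU matching (a stand-in for the Hungarian step of SORT).

    Returns a list of (track_index, detection_index) pairs.
    """
    candidates = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in candidates:
        if score < iou_threshold:
            break  # remaining candidates overlap even less
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Hypothetical boxes: two existing tracks and two new detections.
example_tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
example_detections = [(21, 21, 31, 31), (1, 0, 11, 10)]
example_matches = associate(example_tracks, example_detections)

# Hypothetical error counts chosen only to show the scale of the metric.
example_mota = mota(false_negatives=400, false_positives=180,
                    id_switches=31, num_gt=1000)
```

MOTA is usually reported as a percentage, so a value of 0.389 from `mota` corresponds to a score of 38.9. Because misses, false positives, and identity switches all enter the numerator, detector quality dominates the score in a tracking-by-detection system.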
