Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors
Stefano Bonato, Stefano Carlo Lambertenghi, Elia Cereda, Alessandro Giusti, Daniele Palossi
TL;DR
This work tackles monocular relative pose estimation between nano-drones under stringent power and compute constraints. It introduces a vertically integrated, fully onboard pipeline and compares a lightweight PULP-Frontnet CNN against MobileNetV2 variants. In a ~60 s field test, the system achieves a mean prediction error of 15 cm and a closed-loop control error of 17 cm; it sustains tracking for up to 2 minutes while running inference at 48 Hz within about 95 mW. The approach covers end-to-end deployment: dataset collection, quantization with PACT, optimized C-code generation with DORY for onboard inference, and Kalman-filtered control. The results indicate that low-power monocular relative localization is practical on nano-drones, a building block for swarm operations and scalable multi-agent aerial systems.
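To put the power and rate figures in perspective, the short sketch below derives the average energy per inference and the per-frame time budget. It assumes that the 95 mW figure is the average power drawn while inference runs continuously at 48 Hz; the function names are illustrative and do not come from the paper.

```python
def energy_per_inference_mj(power_mw: float, rate_hz: float) -> float:
    """Average energy per inference in millijoules (mW / Hz = mW*s = mJ)."""
    return power_mw / rate_hz


def frame_budget_ms(rate_hz: float) -> float:
    """Time available per frame, in milliseconds, at a given inference rate."""
    return 1000.0 / rate_hz


def frames_processed(rate_hz: float, duration_s: float) -> int:
    """Number of frames processed over a run of the given duration."""
    return int(rate_hz * duration_s)


if __name__ == "__main__":
    power_mw = 95.0     # reported power consumption (assumed to be an average)
    rate_hz = 48.0      # reported onboard inference rate
    tracking_s = 120.0  # longest reported continuous tracking (2 min)

    print(f"energy per inference: {energy_per_inference_mj(power_mw, rate_hz):.2f} mJ")
    print(f"time budget per frame: {frame_budget_ms(rate_hz):.2f} ms")
    print(f"frames in a 2 min run: {frames_processed(rate_hz, tracking_s)}")
    print(f"energy over a 2 min run: {power_mw * tracking_s / 1000.0:.1f} J")
```

Under these assumptions, each inference costs roughly 2 mJ and must complete within about 21 ms.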
Abstract
Precise relative localization is a crucial functional block for swarm robotics. This work presents a novel autonomous end-to-end system that addresses monocular relative localization, through deep neural networks (DNNs), between two peer nano-drones, i.e., platforms weighing less than 40 g with a processing power budget below 100 mW. To cope with the ultra-constrained nano-drone platform, we propose a vertically integrated framework that spans from dataset collection to final in-field deployment, including dataset augmentation, quantization, and system optimizations. Experimental results show that our DNN can precisely localize a 10 cm-size target nano-drone using only low-resolution monochrome images, at distances of up to ~2 m. On a disjoint testing dataset, our model yields a mean R² score of 0.42 and a root mean square error of 18 cm, which translates into a mean in-field prediction error of 15 cm and a closed-loop control error of 17 cm over a ~60 s flight test. Ultimately, the proposed system improves on the state of the art by showing long-endurance tracking (up to 2 min of continuous tracking), generalizing to a never-seen-before deployment environment, and requiring only 95 mW of power for onboard real-time inference at 48 Hz.
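For readers unfamiliar with the two offline metrics quoted above, the following minimal sketch shows how an R² score and a root mean square error are conventionally computed for one regressed variable. The sample values are made up for illustration and are not data from the paper; how the paper averages R² across output variables is not stated here, so the sketch covers a single variable only.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between ground truth and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical ground-truth and predicted distances, in meters.
    truth = [1.0, 2.0, 3.0, 4.0]
    pred = [1.5, 2.0, 2.5, 4.0]
    print(f"RMSE: {rmse(truth, pred):.3f} m")
    print(f"R2:   {r2_score(truth, pred):.3f}")
```

An R² of 1 corresponds to perfect predictions, while a model that always outputs the mean of the ground truth scores 0. RMSE keeps the unit of the predicted quantity, which is why it can be reported in centimeters.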
