Fusing Multi-sensor Input with State Information on TinyML Brains for Autonomous Nano-drones
Luca Crupi, Elia Cereda, Daniele Palossi
TL;DR
This work tackles the sense-and-act limitations of ultra-low-power TinyML nano-drones by embedding the drone's state information into a lightweight CNN for allocentric human pose estimation. The network uses multi-sensor input (grayscale $160\times96$ images and an $8\times8$ depth map) to predict $x$ and $y$. The work extends a State-of-the-Art (SoA) baseline CNN with four state-aware fusion schemes (input, mid, late direct, and late with MLP) that incorporate the attitude angles ($\varphi$, $\theta$), represented as either 2D state maps or a 2-element vector. Training is done entirely in simulation with domain randomization, and evaluation uses a real-world set of $\sim$3.5k samples. The ablation study shows consistent $R^2$ gains when state information is used: the best configuration, late direct fusion, improves $R^2$ by up to $+0.10$ on $x$ and $+0.01$ on $y$, with negligible overhead (around 0.11% for MACs and minimal changes in memory). Overall, the paper demonstrates a practical path to enhancing allocentric perception on TinyML platforms, enabling more capable autonomous nano-drones without significant energy or compute penalties.
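To make the late direct fusion idea concrete, the following is a minimal sketch of one plausible reading: the 2-element attitude vector ($\varphi$, $\theta$) is concatenated with the CNN's feature vector just before the final fully connected layer that regresses $x$ and $y$. The feature size (64), the random weights, and the helper names `linear` and `late_fusion_direct` are assumptions made for illustration and are not taken from the paper.

```python
import random


def linear(weights, bias, inputs):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def late_fusion_direct(features, phi, theta, weights, bias):
    """Concatenate CNN features with the attitude vector, then regress (x, y).

    Hypothetical helper: an illustrative reading of late direct fusion.
    """
    fused = list(features) + [phi, theta]
    return linear(weights, bias, fused)


# Assumed sizes for illustration: 64 CNN features, 2 state values, 2 outputs.
N_FEATURES, N_STATE, N_OUTPUTS = 64, 2, 2

rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES + N_STATE)]
           for _ in range(N_OUTPUTS)]
bias = [0.0] * N_OUTPUTS
features = [rng.random() for _ in range(N_FEATURES)]  # stand-in for CNN output

x, y = late_fusion_direct(features, phi=0.05, theta=-0.02,
                          weights=weights, bias=bias)

# Extra multiply-accumulates added to the final layer by the state inputs.
extra_macs = N_STATE * N_OUTPUTS
print(x, y, extra_macs)
```

Under these assumptions, the state inputs add only `N_STATE * N_OUTPUTS` multiply-accumulates to the last layer, which gives an intuition for why fusing a small state vector late in the network can cost very little compute.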
Abstract
Autonomous nano-drones (~10 cm in diameter), thanks to their ultra-low-power TinyML-based brains, are capable of coping with real-world environments. However, due to their simplified sensors and compute units, they are still far from the sense-and-act capabilities shown by their bigger counterparts. This system paper presents a novel deep learning-based pipeline that fuses multi-sensor input (i.e., low-resolution images and an $8\times8$ depth map) with the robot's state information to tackle a human pose estimation task. Thanks to our design, the proposed system, trained in simulation and tested on a real-world dataset, improves on a state-unaware State-of-the-Art baseline by increasing the $R^2$ regression metric by up to 0.10 on the distance prediction.
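For readers unfamiliar with the $R^2$ regression metric, here is a minimal sketch of the standard coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$. The function name `r2_score` and the toy values are illustrative assumptions and are not data from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy ground-truth distances and two made-up sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.5, 2.5, 2.5, 3.5]   # each prediction is off by 0.5
pred_b = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean of y_true

score_a = r2_score(y_true, pred_a)
score_b = r2_score(y_true, pred_b)
print(score_a, score_b)
```

A score of 1 means perfect predictions, while a model that always outputs the mean of the ground truth scores 0, so an increase of 0.10 is measured on this scale.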
