Exploring Deep Reinforcement Learning for Robust Target Tracking using Micro Aerial Vehicles
Alberto Dionigi, Mirko Leomanni, Alessandro Saviolo, Giuseppe Loianno, Gabriele Costante
TL;DR
This work addresses robust target tracking for micro aerial vehicles (MAVs) with a model-free deep reinforcement learning (DRL) controller that operates in output feedback, using only relative-position measurements. It adopts an asymmetric actor-critic framework based on Soft Actor-Critic (SAC): the policy network (A-DNN) maps history-augmented observations to continuous thrust and attitude commands, while a privileged critic network (C-DNN) receives additional information that is used only during training. Robustness is built into learning through domain randomization over vehicle mass and actuation delay, together with a reward that emphasizes tracking accuracy, smoothness, and collision avoidance. Compared with a model-based LQG baseline, the DRL controller achieves comparable nominal performance and is more resilient under significant uncertainties. The results are validated in extensive simulations, including vision-based rendering, which suggests practical potential for real-world vision-based MAV tracking.
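The domain randomization over mass and actuation delay mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `EpisodeParams`, `sample_episode_params`, and `DelayedActuator`, the nominal mass, the mass range, and the maximum delay are all hypothetical choices made for illustration; the summary does not state the actual ranges or how the delay is modeled.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Randomized simulator parameters for one training episode (hypothetical)."""
    mass_kg: float    # vehicle mass used by the simulator this episode
    delay_steps: int  # actuation delay, expressed in control steps


def sample_episode_params(rng, nominal_mass_kg=1.0,
                          mass_scale=(0.5, 1.5), max_delay_steps=5):
    """Draw one parameter set; every range here is an illustrative assumption."""
    scale = rng.uniform(*mass_scale)
    delay = rng.randint(0, max_delay_steps)  # inclusive on both ends
    return EpisodeParams(mass_kg=nominal_mass_kg * scale, delay_steps=delay)


class DelayedActuator:
    """Applies each command `delay_steps` control steps after it is issued."""

    def __init__(self, delay_steps, idle_command):
        # Pre-fill the queue so the first outputs are the idle command.
        self._queue = deque([idle_command] * delay_steps)

    def step(self, command):
        # Enqueue the new command and release the oldest pending one.
        self._queue.append(command)
        return self._queue.popleft()
```

In a training loop, one would call `sample_episode_params` at every environment reset, so the policy sees a different mass and delay in each episode and cannot overfit to the nominal model.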
Abstract
The capability to autonomously track a non-cooperative target is a key technological requirement for micro aerial vehicles. In this paper, we propose an output feedback control scheme based on deep reinforcement learning for controlling a micro aerial vehicle to persistently track a flying target while maintaining visual contact. The proposed method leverages relative position data for control, relaxing the assumption of access to full state information, which is typical of related approaches in the literature. Moreover, we exploit classical robustness indicators in the learning process through domain randomization to increase the robustness of the learned policy. Experimental results validate the proposed approach for target tracking, demonstrating high performance and robustness with respect to mass mismatches and control delays. The resulting nonlinear controller significantly outperforms a standard model-based design in numerous off-nominal scenarios.
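Because the controller sees only relative position rather than the full state, a short history of measurements is one common way to let a policy infer unmeasured quantities such as relative velocity. The sketch below shows one way such a history-augmented observation could be assembled. The class name `ObservationHistory`, the window length, the zero padding at episode start, and the choice to stack only positions are assumptions for illustration, not details taken from the paper.

```python
from collections import deque


class ObservationHistory:
    """Stacks the last `window` relative-position measurements (hypothetical)."""

    def __init__(self, window=4, dim=3):
        # Zero-pad the buffer so the observation has a fixed size from step one.
        self._buf = deque([[0.0] * dim for _ in range(window)], maxlen=window)

    def push(self, rel_pos):
        """Add the newest measurement and return the flattened history,
        ordered from oldest to newest."""
        self._buf.append([float(x) for x in rel_pos])
        return [x for frame in self._buf for x in frame]
```

With `window=2`, pushing `[1, 2, 3]` and then `[4, 5, 6]` yields a six-element observation holding both measurements, and the oldest one is dropped on the next push.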
