Learning Hybrid Policies for MPC with Application to Drone Flight in Unknown Dynamic Environments
Zhaohan Feng, Jie Chen, Wei Xiao, Jian Sun, Bin Xin, Gang Wang
TL;DR
This work addresses autonomous drone traversal through swinging gates with unknown dynamics by proposing hyMPC, a hybrid control framework that combines parameterized MPC with learned high-level decisions. A high-level Gaussian policy determines how to mix two MPC subtasks, gate-following and gate-traversing, while an online model predicts gate motion to supply real-time references. Policy search is performed episodically, and deep neural networks are trained offline to output the preferred traversal time and mixing weights. The approach is validated in simulation, where hyMPC achieves a near-perfect success rate and smaller traversal errors than baselines across varying initial distances and under thrust perturbations, including multi-gate scenarios. The findings suggest that hyMPC adapts robustly and data-efficiently to unknown environmental dynamics, with practical implications for real-world drone operation in dynamic environments.
Abstract
In recent years, drones have been applied to an increasingly wide range of real-world tasks. Model predictive control (MPC) has emerged as a practical method for drone flight control, owing to its robustness against modeling errors, uncertainties, and external disturbances. However, MPC is sensitive to manually tuned parameters, and its performance can degrade rapidly when the environmental dynamics are unknown. This paper addresses the challenge of controlling a drone as it traverses a swinging gate with unknown dynamics. We introduce a parameterized MPC approach, named hyMPC, that uses high-level decision variables to adapt to uncertain environmental conditions. To obtain these decision variables, we present a novel policy search framework for training a high-level Gaussian policy. We then train neural network policies on data gathered through repeated execution of the Gaussian policy, so that decision variables can be provided in real time. The effectiveness of hyMPC is validated through numerical simulations, in which it achieves a 100% success rate over 20 drone flight tests traversing a swinging gate, demonstrating safe and precise flight with limited prior knowledge of the environmental dynamics.
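To make the idea of high-level decision variables concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the decision variables are a traversal time and a single mixing weight in (0, 1), that the Gaussian policy is diagonal, and that each subtask contributes a squared position-tracking cost. The names `sample_decision` and `blended_cost`, the sigmoid squashing, and the cost form are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_decision(mean, std, rng):
    """Sample decision variables from a diagonal Gaussian policy (assumed form).

    mean, std: pairs for (traversal time, raw mixing weight).
    Returns (t_traverse, weight) with t_traverse >= 0 and 0 < weight < 1.
    """
    raw_time = rng.gauss(mean[0], std[0])
    raw_weight = rng.gauss(mean[1], std[1])
    t_traverse = max(raw_time, 0.0)   # a traversal time cannot be negative
    weight = sigmoid(raw_weight)      # squash to a valid mixing weight
    return t_traverse, weight


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def blended_cost(drone_pos, follow_ref, traverse_ref, weight):
    """Blend two subtask costs (assumed quadratic tracking costs).

    weight = 0 gives pure gate-following; weight = 1 gives pure gate-traversing.
    """
    follow = squared_distance(drone_pos, follow_ref)
    traverse = squared_distance(drone_pos, traverse_ref)
    return (1.0 - weight) * follow + weight * traverse


if __name__ == "__main__":
    rng = random.Random(0)
    t_traverse, weight = sample_decision(mean=(2.0, 0.0), std=(0.3, 1.0), rng=rng)
    cost = blended_cost((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), weight)
    print(t_traverse, weight, cost)
```

In a full pipeline, a blended cost of this kind would sit inside the MPC objective over the prediction horizon, and the policy mean and standard deviation would be updated episodically from the observed traversal outcomes; both steps are omitted here.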
