An Integrated Imitation and Reinforcement Learning Methodology for Robust Agile Aircraft Control with Limited Pilot Demonstration Data
Gulay Goktas Sever, Umut Demir, Abdullah Sadik Satir, Mustafa Cagatay Sahin, Nazim Kemal Ure
TL;DR
This work addresses robust agile aircraft maneuver generation with limited pilot data by using a source model to generate unlimited simulated data and an integrated IL-TL-RL pipeline. The pipeline starts with imitation learning via Behavior Cloning, made more robust with Confidence-DAgger, then applies transfer learning to adapt to a target aircraft with minimal data, and finally employs additive reinforcement learning (TD3) to adapt to updated dynamics. The final command is $A_{TL+RL} = A_{TL} + C_{RL} A_{RL}$, with $C_{RL}\in(0,1]$. The approach is validated using real pilot data from Turkish Aerospace Industries, with an open-source F-16 model as the source, and achieves cross-trim and cross-aircraft generalization with few target demonstrations and rapid RL fine-tuning (1–2 hours). The results demonstrate a data-efficient, robust, transferable framework for agile maneuver generation that reduces pilot data requirements and accelerates validation of prototypes.
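The additive command $A_{TL+RL} = A_{TL} + C_{RL} A_{RL}$ can be illustrated with a minimal sketch. The function name `combine_commands` and the representation of a command as a list of floats (one entry per control channel) are assumptions made for illustration only; the source gives just the formula and the range $C_{RL}\in(0,1]$.

```python
from typing import List, Sequence


def combine_commands(
    a_tl: Sequence[float],
    a_rl: Sequence[float],
    c_rl: float,
) -> List[float]:
    """Return A_TL + C_RL * A_RL, element-wise per control channel.

    a_tl: command from the transfer-learned policy (hypothetical layout).
    a_rl: corrective command from the RL policy (same layout as a_tl).
    c_rl: scaling coefficient, required to lie in (0, 1].
    """
    if not 0.0 < c_rl <= 1.0:
        raise ValueError("c_rl must lie in the interval (0, 1]")
    if len(a_tl) != len(a_rl):
        raise ValueError("a_tl and a_rl must have the same length")
    # The RL term is a scaled additive correction on top of the TL command.
    return [tl + c_rl * rl for tl, rl in zip(a_tl, a_rl)]


# Illustrative values only; these are not taken from the paper.
combined = combine_commands([0.5, -0.2], [0.1, 0.4], c_rl=0.5)
```

A smaller $C_{RL}$ keeps the combined command closer to the transfer-learned command, while $C_{RL} = 1$ applies the RL correction at full scale.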
Abstract
In this paper, we present a methodology for constructing data-driven maneuver generation models for agile aircraft that can generalize across a wide range of trim conditions and aircraft model parameters. Maneuver generation models play a crucial role in the testing and evaluation of aircraft prototypes, providing insights into the maneuverability and agility of the aircraft. However, constructing these models typically requires large amounts of real pilot data, which are time-consuming and costly to obtain. Moreover, models built with limited data often struggle to generalize beyond the specific flight conditions covered in the original dataset. To address these challenges, we propose a hybrid architecture that leverages a simulation model, referred to as the source model. This open-source agile aircraft simulator shares similar dynamics with the target aircraft and allows us to generate unlimited data for building a proxy maneuver generation model. We then fine-tune this model to the target aircraft using a limited amount of real pilot data. Our approach combines techniques from imitation learning, transfer learning, and reinforcement learning to achieve this objective. To validate our methodology, we use real pilot data from an agile aircraft, provided by Turkish Aerospace Industries (TAI). By employing the F-16 as the source model, we demonstrate that it is possible to construct a maneuver generation model that generalizes across various trim conditions and aircraft parameters without requiring any additional real pilot data. Our results showcase the effectiveness of our approach in developing robust and adaptable models for agile aircraft.
