Multitask Reinforcement Learning for Quadcopter Attitude Stabilization and Tracking using Graph Policy
Yu Tang Liu, Afonso Vale, Aamir Ahmad, Rodrigo Ventura, Meysam Basiri
TL;DR
This work tackles quadcopter attitude control for both smooth tracking and aggressive stabilization by introducing a Graph Convolutional Network (GCN) policy trained with multitask Soft Actor-Critic (SAC) in IsaacGym. The approach encodes explicit domain priors in a learnable graph that fuses state, task, and environment information, yielding faster learning and higher sample efficiency than single-task training, as well as a compact onboard controller that runs at 400 Hz on a Pixhawk. Sim-to-real transfer relies on RMA-based domain adaptation and a lightweight adaptor, and real-world tests demonstrate robust tracking and stabilization, including recovery from free-fall. Despite its strong onboard performance, the method still struggles to fully decouple task components and may benefit from larger graph architectures or compression techniques.
Abstract
Quadcopter attitude control involves two tasks: smooth attitude tracking and aggressive stabilization from arbitrary states. Although both can be formulated as tracking problems, their distinct state spaces and control strategies complicate the design of a unified reward function. We propose a multitask deep reinforcement learning framework that combines parallel simulation in IsaacGym with a Graph Convolutional Network (GCN) policy to address both tasks. Our multitask Soft Actor-Critic (SAC) approach achieves faster, more reliable learning and higher sample efficiency than single-task methods. We validate its real-world applicability by deploying the learned policy, a compact two-layer network with 24 neurons per layer, on a Pixhawk flight controller, where it runs at 400 Hz without extra computational resources. We provide our code at https://github.com/robot-perception-group/GraphMTSAC_UAV/.
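
To give a rough sense of how small a two-layer, 24-neuron graph policy is, the following is a minimal NumPy sketch of a generic graph-convolution forward pass. It is not the released implementation: the node layout, feature size, fixed adjacency matrix, mean pooling, `tanh` activations, four-dimensional action output, and the names `normalize_adjacency`, `init_params`, and `gcn_policy` are all illustrative assumptions. Only the two hidden layers of 24 units come from the abstract, and the graph used in the paper is learnable rather than fixed as it is here.

```python
import numpy as np

HIDDEN = 24  # neurons per layer, as stated in the abstract


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def init_params(rng, feat_dim, n_actions=4):
    """Random weights for two graph layers and a linear output head.

    feat_dim and n_actions are hypothetical sizes chosen for illustration.
    """
    return {
        "w1": rng.standard_normal((feat_dim, HIDDEN)) / np.sqrt(feat_dim),
        "w2": rng.standard_normal((HIDDEN, HIDDEN)) / np.sqrt(HIDDEN),
        "w_out": rng.standard_normal((HIDDEN, n_actions)) / np.sqrt(HIDDEN),
    }


def gcn_policy(node_feats, adj, params):
    """Map per-node features of shape (N, feat_dim) to a bounded action vector."""
    a_norm = normalize_adjacency(adj)
    h = np.tanh(a_norm @ node_feats @ params["w1"])  # layer 1: (N, 24)
    h = np.tanh(a_norm @ h @ params["w2"])           # layer 2: (N, 24)
    pooled = h.mean(axis=0)                          # average over nodes: (24,)
    return np.tanh(pooled @ params["w_out"])         # actions in [-1, 1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 3-node graph (e.g. state, task, environment), fully connected.
    adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
    feats = rng.standard_normal((3, 6))
    action = gcn_policy(feats, adj, init_params(rng, feat_dim=6))
    print(action.shape)
```

At a 400 Hz control rate, each forward pass has a budget of 1/400 s = 2.5 ms. A network of this size needs only a few small matrix products per step, which is consistent with the abstract's claim that it runs on the flight controller without extra computational resources.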
