Learning Agile and Robust Omnidirectional Aerial Motion on Overactuated Tiltable-Quadrotors

Wentao Zhang; Zhaoqi Ma; Jinjie Li; Huayi Wang; Haokun Liu; Junichiro Sugihara; Chen Chen; Yicheng Chen; Moju Zhao

TL;DR

This work presents a learning-based control framework for over-actuated tiltable quadrotors that enables efficient acquisition of coordinated rotor-joint behaviors for reaching target poses in $SE(3)$. It achieves six-degree-of-freedom pose tracking accuracy comparable to a state-of-the-art NMPC controller, while demonstrating superior robustness and generalization across diverse tasks.

Abstract

Tilt-rotor aerial robots enable omnidirectional maneuvering through thrust vectoring, but introduce significant control challenges due to the strong coupling between joint and rotor dynamics. While model-based controllers can achieve high motion accuracy under nominal conditions, their robustness and responsiveness often degrade in the presence of disturbances and modeling uncertainties. This work investigates reinforcement learning for omnidirectional aerial motion control on over-actuated tiltable quadrotors that prioritizes robustness and agility. We present a learning-based control framework that enables efficient acquisition of coordinated rotor-joint behaviors for reaching target poses in the $SE(3)$ space. To achieve reliable sim-to-real transfer while preserving motion accuracy, we integrate system identification with minimal and physically consistent domain randomization. Compared with a state-of-the-art NMPC controller, the proposed method achieves comparable six-degree-of-freedom pose tracking accuracy, while demonstrating superior robustness and generalization across diverse tasks, enabling zero-shot deployment on real hardware.
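
As a minimal sketch of how six-degree-of-freedom pose tracking error on $SE(3)$ is commonly quantified, the snippet below splits the error into a Euclidean position error and a geodesic orientation error between unit quaternions. The function names, the `(w, x, y, z)` quaternion convention, and the example values are assumptions made for illustration; the paper's own error metric is not stated in this summary.

```python
import math


def position_error(p, p_target):
    """Euclidean distance (m) between the current and target positions."""
    return math.dist(p, p_target)


def orientation_error(q, q_target):
    """Geodesic angle (rad) between two unit quaternions given as (w, x, y, z).

    The absolute value of the dot product makes q and -q equivalent,
    since both represent the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q, q_target)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return 2.0 * math.acos(dot)


# Hypothetical example: 10 cm off target along x, rotated 90 degrees about z.
p_err = position_error((0.0, 0.0, 1.0), (0.1, 0.0, 1.0))
half = math.radians(90.0) / 2.0
q_err = orientation_error(
    (math.cos(half), 0.0, 0.0, math.sin(half)),
    (1.0, 0.0, 0.0, 0.0),
)
```

Reporting the two components separately keeps the units distinct (meters and radians), which is how position and orientation errors are typically presented side by side.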

Paper Structure (34 sections, 12 equations, 9 figures, 7 tables)

Introduction
Related Work
Reinforcement Learning for Robot Control
Motion Control for Tiltable Aerial Robot
Method
Preliminaries
Platform Introduction
Task Definition
Problem Formulation
Training Scheme
Observation and Action Space
Reward Function
Initial State Distribution
Sim-to-Real Transfer
System Latency
...and 19 more sections

Figures (9)

Figure 1: Overall RL framework for system identification, training, and deployment. Robot: Rotor modules (red, target thrust ${\boldsymbol{f}}_{\mathrm{r}}^{*} \rightarrow$ real thrust $\boldsymbol{f}_{\mathrm{r}}$ and torque $\boldsymbol{\tau}_{\mathrm{r}}$), joint modules (blue, joint position commands $\boldsymbol{q}_{\mathrm{j}}^* \rightarrow$ position feedback $\boldsymbol{q}_{\mathrm{j}}$), and system latency (green) are modeled to construct physically consistent simulation environments. Training: The policy is trained under an asymmetric actor-critic framework, and dynamics randomization (green) is integrated according to the models and the deployment scheme. Deployment: The trained policy is deployed on the real robot platform with a consistent system architecture. The robot's state is estimated by fusing data from a motion capture system (MoCap) and an onboard IMU using an extended Kalman filter (EKF).
Figure 2: Statistical distributions of position and orientation errors during stable hovering at the sampled target poses.
Figure 3: Disturbance rejection evaluation results under time-varying external forces and torques. The disturbances are updated every 2 s, and the black dashed lines denote their magnitudes.
Figure 4: State trajectories from real-world continuous pose-reaching experiments. Solid lines show the mean values across trials, and the shaded regions represent the corresponding minimum and maximum ranges.
Figure 5: Real-world disturbance rejection evaluation. The first part shows the response under wind disturbances, while the latter part shows the response to impulsive stick pushes.
...and 4 more figures
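
The Figure 3 caption describes external disturbances that are updated every 2 s. One way such a piecewise-constant disturbance signal could be generated in simulation is sketched below. The function name, the uniform sampling, the 1 N bound, and the 10 ms step are all hypothetical choices for illustration and are not taken from the paper.

```python
import random


def disturbance_schedule(duration_s, dt_s, hold_s=2.0, max_force_n=1.0, seed=0):
    """Return one (fx, fy, fz) force per simulation step.

    A new force is drawn every `hold_s` seconds and held constant in between.
    The bound `max_force_n` is an assumed value, not one reported in the paper.
    """
    rng = random.Random(seed)
    steps_per_hold = round(hold_s / dt_s)
    n_steps = round(duration_s / dt_s)
    schedule = []
    current = (0.0, 0.0, 0.0)
    for k in range(n_steps):
        if k % steps_per_hold == 0:
            # Resample each axis independently within the assumed bound.
            current = tuple(rng.uniform(-max_force_n, max_force_n) for _ in range(3))
        schedule.append(current)
    return schedule


# Hypothetical run: 6 s of simulation at a 10 ms step gives three hold intervals.
forces = disturbance_schedule(duration_s=6.0, dt_s=0.01)
```

A torque disturbance could be produced the same way with its own bound; fixing the seed keeps the disturbance sequence identical across controllers being compared.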
