ASTER: Attitude-aware Suspended-payload Quadrotor Traversal via Efficient Reinforcement Learning

Dongcheng Cao, Jin Zhou, Shuo Li

TL;DR

ASTER is a robust RL framework that achieves, to the authors' knowledge, the first successful autonomous inverted flight for the quadrotor cable-suspended system. It introduces hybrid-dynamics-informed state seeding (HDSS), an initialization strategy that back-propagates target configurations through physics-consistent kinematic inversions across both taut and slack cable phases.

Abstract

Agile maneuvering of the quadrotor cable-suspended system is significantly hindered by its non-smooth hybrid dynamics. While model-free Reinforcement Learning (RL) circumvents explicit differentiation of complex models, achieving attitude-constrained or inverted flight remains an open challenge due to the extreme reward sparsity under strict orientation requirements. This paper presents ASTER, a robust RL framework that achieves, to our knowledge, the first successful autonomous inverted flight for the cable-suspended system. We propose hybrid-dynamics-informed state seeding (HDSS), an initialization strategy that back-propagates target configurations through physics-consistent kinematic inversions across both taut and slack cable phases. HDSS enables the policy to discover aggressive maneuvers that are unreachable via standard exploration. Extensive simulations and real-world experiments demonstrate remarkable agility, precise attitude alignment, and robust zero-shot sim-to-real transfer across complex trajectories.
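
As a rough illustration of what "back-propagating a target configuration" can mean in the slack phase, the sketch below rewinds a payload state under the assumption that a slack cable exerts no force, so the payload is a free-falling point mass. This is not the authors' HDSS implementation: the function names, the z-up gravity convention, and the restriction to the slack phase are assumptions made here for illustration, and the taut-phase inversion is not covered.

```python
# Hypothetical sketch: rewind a payload state through a slack-cable (free-fall) phase.
# Assumption: z-up world frame, cable exerts no force while slack, no air drag.
G = (0.0, 0.0, -9.81)  # gravity vector in m/s^2 (assumed convention)


def rewind_slack_payload(pos, vel, tau):
    """Return the payload (position, velocity) tau seconds BEFORE the given state."""
    # Invert v(t) = v0 + g*t  ->  v0 = v - g*tau
    vel0 = tuple(v - g * tau for v, g in zip(vel, G))
    # Invert p(t) = p0 + v0*t + 0.5*g*t^2  ->  p0 = p - v*tau + 0.5*g*tau^2
    pos0 = tuple(p - v * tau + 0.5 * g * tau * tau for p, v, g in zip(pos, vel, G))
    return pos0, vel0


def forward_slack_payload(pos, vel, tau):
    """Propagate the payload tau seconds FORWARD under the same free-fall model."""
    vel1 = tuple(v + g * tau for v, g in zip(vel, G))
    pos1 = tuple(p + v * tau + 0.5 * g * tau * tau for p, v, g in zip(pos, vel, G))
    return pos1, vel1


# Target payload state at a waypoint (illustrative numbers, not from the paper).
target_pos, target_vel = (1.0, 2.0, 3.0), (0.5, 0.0, -1.0)
seed_pos, seed_vel = rewind_slack_payload(target_pos, target_vel, 0.2)
```

Propagating the seeded state forward by the same duration with `forward_slack_payload` recovers the target state, which is the sense in which such an initialization is physics-consistent: an episode started from the seed can reach the target under the modeled dynamics.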

Paper Structure (25 sections, 16 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Composite time-lapse illustration of the real-world inverted flight maneuvers, showcasing the continuous motion and attitude transitions.
  • Figure 2: Overview of the training pipeline of the proposed framework. We highlight the design of hybrid-dynamics-informed state seeding (HDSS), a critical strategy for state initialization to overcome exploration bottlenecks in reward-sparse tasks. HDSS back-propagates target configurations through hybrid phases to provide physics-consistent initializations. This strategy balances efficient maneuver discovery with global robustness, enabling the policy to master complex flight dynamics.
  • Figure 3: Schematic of the reference frames and payload state, including the world frame $\bm{W}$, the quadrotor body frame $\bm{B}$, and the target frame $\bm{T}$ (illustrated for a waypoint with a downward-pointing $Z$-axis). The vector $\mathbf{x}_l^{\bm{B}}$ denotes the payload position expressed in the quadrotor body frame.
  • Figure 4: Agile trajectory visualization on 3 representative tracks. The policy is evaluated on (a) Ribbon, (b) Croissant, and (c) Multi-heading tracks. The quadrotor's velocity is visualized via heatmaps, with payload trajectories and system keyframes overlaid. The high-speed, attitude-constrained navigation across all tracks validates the robustness and agility of the learned policy. For visual clarity, only inverted waypoints are denoted by circular markers with embedded arrows indicating the target z-axis direction, while standard upright waypoints are hidden.
  • Figure 5: Time-lapse visualization of agile flight on the Croissant track in the Genesis simulator. The snapshot sequence demonstrates the policy's capability to execute consecutive inverted maneuvers while maintaining precise attitude alignment and system stability.
  • ...and 3 more figures
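
The payload position in the body frame described in the Figure 3 caption can be obtained from world-frame quantities with the standard rigid-body change of frame, $\mathbf{x}_l^{\bm{B}} = \mathbf{R}^\top (\mathbf{x}_l^{\bm{W}} - \mathbf{x}_q^{\bm{W}})$. The sketch below is a minimal illustration of that transform; the function name, the symbols $\mathbf{R}$ (body-to-world rotation) and $\mathbf{x}_q^{\bm{W}}$ (quadrotor position), and the numbers are assumptions for illustration and are not taken from the paper.

```python
# Hypothetical sketch: express the payload position in the quadrotor body frame.
# Assumption: R_wb is a 3x3 rotation matrix (nested tuples) mapping body-frame
# vectors into the world frame, so its transpose maps world to body.


def payload_in_body_frame(R_wb, quad_pos_w, payload_pos_w):
    """Return x_l^B = R_wb^T (x_l^W - x_q^W) as a 3-tuple."""
    # Payload offset from the quadrotor, still in world coordinates.
    d = [pl - q for pl, q in zip(payload_pos_w, quad_pos_w)]
    # Multiply by the transpose: component i uses column i of R_wb.
    return tuple(sum(R_wb[k][i] * d[k] for k in range(3)) for i in range(3))


# Inverted quadrotor: body frame rotated 180 degrees about the world x-axis.
R_inverted = ((1.0, 0.0, 0.0),
              (0.0, -1.0, 0.0),
              (0.0, 0.0, -1.0))

# Payload hanging 0.5 m below the quadrotor in the world frame.
x_l_body = payload_in_body_frame(R_inverted, (0.0, 0.0, 1.0), (0.0, 0.0, 0.5))
```

In this example the payload hangs below the vehicle in the world frame, but because the quadrotor is inverted the body-frame vector points along the positive body z-axis (z component 0.5). This is why a body-frame payload state carries attitude information in addition to the cable geometry.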