Learning-based Trajectory Tracking for Bird-inspired Flapping-Wing Robots

Jiaze Cai, Vishnu Sangli, Mintae Kim, Koushil Sreenath

TL;DR

A model-free reinforcement learning (RL)-based framework for a high degree-of-freedom (DoF) bird-inspired flapping-wing robot that allows for multimodal flight and agile trajectory tracking is proposed.

Abstract

Bird-sized flapping-wing robots offer significant potential for agile flight in complex environments, but achieving agile and robust trajectory tracking remains a challenge due to the complex aerodynamics and highly nonlinear dynamics inherent in flapping-wing flight. In this work, a learning-based control approach is introduced to unlock the versatility and adaptiveness of flapping-wing flight. We propose a model-free reinforcement learning (RL)-based framework for a high degree-of-freedom (DoF) bird-inspired flapping-wing robot that allows for multimodal flight and agile trajectory tracking. Stability analysis was performed on the closed-loop system comprising the flapping-wing system and the RL policy. Additionally, simulation results demonstrate that the RL-based controller can successfully learn complex wing trajectory patterns, achieve stable flight, switch between flight modes spontaneously, and track different trajectories under various aerodynamic conditions.

Paper Structure

This paper contains 22 sections, 7 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 2: The layout of the flapping-wing robot. In simulation, the flapping-wing robot is modeled as 4 rigid ellipsoid bodies (yellow ellipses) with 5 joints (blue cylinders). The bold red, green, and blue arrows represent the local xyz-frame of the vehicle.
  • Figure 3: The flapping-wing robot follows a loop trajectory generated from simulation. The robot performs an Immelmann turn (pitch up and roll back level), a half-loop maneuver (pitch down and roll back level), and a rejoin to the trajectory. The dark green points represent the reference points of the trajectory over time.
  • Figure 4: The control diagram for flapping-wing robot trajectory tracking control. The variables used here are covered in Sec. \ref{sec:StateAct_Spaces}.
  • Figure 5: Forward flapping behavior of the controller, shown through time series of the wing pitch angle, wing pitch angle, and tail angle over 1 second. The dashed lines are target joint positions, while the solid lines indicate the actual joint positions.
  • Figure 6: System identification is performed on the closed-loop system. A low-dimensional system is derived from the high-dimensional, nonlinear dynamics of the flapping-wing robot, which is controlled by its RL policy. The input, $\mathbf{u}$, of this simplified system is the desired global position, while the output, $\mathbf{y}$, represents the robot’s measured response, driven by the low-level RL policy.
  • ...and 8 more figures