Safe Reinforcement Learning Filter for Multicopter Collision-Free Tracking under disturbances

Qihan Qi, Xinsong Yang, Gang Xia

Abstract

This paper proposes a safe reinforcement learning filter (SRLF) that achieves collision-free multicopter trajectory tracking under input disturbances. A novel robust control barrier function (RCBF), together with its analysis techniques, is introduced to avoid collisions under unknown disturbances during tracking. To keep the system state within the safe set, an RCBF gain is designed into the control action. A safety filter transforms unsafe reinforcement learning (RL) control inputs into safe ones, so RL training can proceed without explicitly considering safety constraints. The SRLF obtains control actions with rigorous safety guarantees by solving a quadratic programming (QP) problem that incorporates the forward invariance condition of the RCBF and input saturation constraints. Both simulations and real-world experiments on multicopters demonstrate the effectiveness and performance of SRLF in achieving collision-free tracking under input disturbances and saturation.
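
The safety-filter idea in the abstract (project an RL action onto the set of actions that satisfy a barrier condition and the saturation limits) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: a single-integrator model x' = u + d with a bounded disturbance, a spherical-obstacle barrier b(x) = ||x - c||^2 - r^2, a generic worst-case disturbance margin rather than the paper's RCBF and gain design, and Dykstra's alternating projections in place of a QP solver. All function and parameter names are hypothetical.

```python
import math


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def robust_cbf_halfspace(x, center, radius, alpha, d_max):
    """Return (a, beta) so that the robust barrier condition reads a . u >= beta.

    Assumed model (not the paper's): x' = u + d with ||d|| <= d_max.
    Then b' = grad . (u + d) >= grad . u - ||grad|| * d_max, and requiring
    that lower bound to be >= -alpha * b gives the half-space below.
    """
    diff = [xi - ci for xi, ci in zip(x, center)]
    b = dot(diff, diff) - radius ** 2          # b(x) >= 0 means "outside obstacle"
    grad = [2.0 * v for v in diff]             # gradient of b with respect to x
    beta = -alpha * b + math.sqrt(dot(grad, grad)) * d_max
    return grad, beta


def project_halfspace(u, a, beta):
    """Euclidean projection of u onto {u : a . u >= beta}."""
    slack = beta - dot(a, u)
    aa = dot(a, a)
    if slack <= 0.0 or aa == 0.0:              # already feasible, or degenerate gradient
        return list(u)
    return [ui + slack * ai / aa for ui, ai in zip(u, a)]


def project_box(u, u_max):
    """Euclidean projection onto the saturation box |u_i| <= u_max."""
    return [min(max(ui, -u_max), u_max) for ui in u]


def safety_filter(u_rl, a, beta, u_max, iters=200):
    """Approximate argmin ||u - u_rl||^2 s.t. a . u >= beta and |u_i| <= u_max.

    Uses Dykstra's alternating projections, which converges to the projection
    onto the intersection of the two convex sets when that intersection is
    non-empty. Feasibility is assumed here, not checked.
    """
    u = list(u_rl)
    p = [0.0] * len(u)                         # correction term for the half-space
    q = [0.0] * len(u)                         # correction term for the box
    for _ in range(iters):
        up = [ui + pi for ui, pi in zip(u, p)]
        y = project_halfspace(up, a, beta)
        p = [v - yi for v, yi in zip(up, y)]
        yq = [yi + qi for yi, qi in zip(y, q)]
        u = project_box(yq, u_max)
        q = [v - ui for v, ui in zip(yq, u)]
    return u


# Hypothetical scenario: vehicle at (1.5, 0), unit-radius obstacle at the origin,
# RL action pointing toward the obstacle.
a, beta = robust_cbf_halfspace([1.5, 0.0], [0.0, 0.0], radius=1.0, alpha=1.0, d_max=0.2)
u_safe = safety_filter([-1.0, 0.5], a, beta, u_max=1.0)
print(u_safe)  # x-component is raised to about -0.217; y-component stays 0.5
```

In this sketch an action that already satisfies both constraints passes through unchanged, which mirrors the abstract's point that the RL policy can be trained without encoding safety itself. A practical implementation would more likely pass the same objective and constraints to a dedicated QP solver.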

Paper Structure

This paper contains 11 sections, 3 theorems, 25 equations, 13 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

Consider the system 2.2 with disturbance $\boldsymbol{\mu}$, and let $b(\cdot): \mathbb{R}^{n}\rightarrow\mathbb{R}$ be a smooth function defined on $\mathcal{D}$. If $b(\cdot)$ is an RCBF with $\mathcal{C}\subset\mathcal{C}_{d}\subset\mathcal{D}$, and the set $\mathcal{C}_{d}$ satisfies $\iota(\cdot

Figures (13)

  • Figure 1: The framework of SRLF.
  • Figure 2: The average return training curves of SPPOF, SSACF, PPO-Lag, SAC-Lag, SPPOF_l and SSACF_l over 5 runs with different seeds. The lines and shaded areas represent the average return and the 95% confidence interval, respectively.
  • Figure 3: The average cost training curves of SPPOF_l, PPO-Lag and SSACF_l over 5 runs with different seeds. The cost value is the number of collisions. The lines and shaded areas represent the average cost and the 95% confidence interval, respectively.
  • Figure 4: Collision-free figure-8 tracking of SPPOF, SSACF and PPO-Lag. The solid line represents the trajectory without disturbances, the dashed line represents the trajectory under disturbances, and the gray spheres are obstacles.
  • Figure 5: Collision-free figure-8 tracking of SPPOF_l, SSACF_l and PPO-Lag. The solid line represents the trajectory without disturbances, the dashed line represents the trajectory under disturbances, and the gray spheres are obstacles.
  • ...and 8 more figures

Theorems & Definitions (8)

  • Definition 1
  • Definition 2
  • Lemma 1: kolathaya2018input
  • Remark 1
  • Theorem 1
  • Remark 2
  • Remark 3
  • Theorem 2