Reinforcement Learning Based Prediction of PID Controller Gains for Quadrotor UAVs

Serhat Sönmez, Luca Montecchio, Simone Martini, Matthew J. Rutherford, Alessandro Rizzo, Margareta Stefanovic, Kimon P. Valavanis

TL;DR

This work tackles the challenge of achieving accurate quadrotor trajectory tracking by automatically tuning the inner-loop PD gains via reinforcement learning. A DDPG-based agent is trained offline in MATLAB/Simulink to adjust five normalized gain weights, using a piecewise attitude-error reward, and is validated through numerical simulations, Hardware-In-The-Loop testing, and outdoor flights. The study demonstrates that RL-tuned gains reduce attitude errors and overshoot compared with hand-tuned gains, and the gains can adapt online to disturbances, despite training without some physical effects. The results highlight the potential of RL-based fine-tuning to bridge simulation and real-world UAV control, with practical implications for robust, adaptable autonomous flight and directions for future enhancements such as disturbance-aware training and GPS-robust positioning.

Abstract

A reinforcement learning (RL) based methodology is proposed and implemented for online fine-tuning of PID controller gains, thereby improving the accuracy of quadrotor trajectory tracking. The RL agent is first trained offline on a quadrotor PID attitude controller and then validated through simulations and experimental flights. The agent uses the Deep Deterministic Policy Gradient (DDPG) algorithm, an off-policy actor-critic method. Training and simulation studies are performed using MATLAB/Simulink and the UAV Toolbox Support Package for PX4 Autopilots. The hand-tuned and RL-tuned controllers are then evaluated and compared. The results show that the RL-based controller gains are adjusted during flight and achieve the smallest attitude errors, significantly improving attitude tracking performance over the hand-tuned approach.
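The idea of an agent that outputs normalized weights, which rescale baseline PD gains, and that is scored by a piecewise attitude-error reward can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `PDGains`, `scale_gains`, `pd_command`, and `piecewise_reward`, the ±50% scaling span, the reward thresholds, and the use of two weights for a single axis (the summary mentions five weights but does not say how they map to gains). The paper's own implementation is in MATLAB/Simulink.

```python
from dataclasses import dataclass


@dataclass
class PDGains:
    kp: float  # proportional gain
    kd: float  # derivative gain


def scale_gains(base: PDGains, w_kp: float, w_kd: float, span: float = 0.5) -> PDGains:
    """Map normalized weights in [-1, 1] to gains within +/- span of the baseline.

    The +/- 50% span is a hypothetical choice, not a value from the paper.
    """
    w_kp = max(-1.0, min(1.0, w_kp))  # clip actor output to the valid range
    w_kd = max(-1.0, min(1.0, w_kd))
    return PDGains(base.kp * (1.0 + span * w_kp), base.kd * (1.0 + span * w_kd))


def pd_command(gains: PDGains, angle_error: float, rate_error: float) -> float:
    """Standard PD law: command = Kp * e + Kd * de/dt."""
    return gains.kp * angle_error + gains.kd * rate_error


def piecewise_reward(angle_error_rad: float, tight: float = 0.02, loose: float = 0.10) -> float:
    """Piecewise reward on |attitude error|; thresholds are hypothetical."""
    e = abs(angle_error_rad)
    if e <= tight:
        return 1.0   # error inside the tight band: reward
    if e <= loose:
        return 0.0   # acceptable error: neutral
    return -1.0      # large error: penalty


# Example: the agent raises Kp and lowers Kd relative to a hand-tuned baseline.
tuned = scale_gains(PDGains(kp=4.0, kd=1.0), w_kp=1.0, w_kd=-1.0)
command = pd_command(tuned, angle_error=0.1, rate_error=-0.2)
```

In a DDPG setup, the actor network would produce the weights at each step and the critic would learn from the reward signal; bounding the weights keeps the tuned gains close to a baseline that is already known to stabilize the vehicle.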

Paper Structure

This paper contains 15 sections, 23 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Interaction scheme between the agent and environment (Sutton and Barto, 2018).
  • Figure 2: Block diagram of the PD parameter tuning based on RL.
  • Figure 3: Structure of the actor neural network used for PD parameter tuning.
  • Figure 4: Structure of the critic neural network used for PD parameter tuning.
  • Figure 5: Reinforcement learning fine-tuning agent subsystem.
  • ...and 5 more figures