Deep Reinforcement Learning Control for Disturbance Rejection in a Nonlinear Dynamic System with Parametric Uncertainty

Vincent W. Hill

TL;DR

This paper tackles disturbance rejection for a nonlinear flexible inverted pendulum with cart (FIPWC) system whose model parameters are uncertain. It uses Deep Deterministic Policy Gradient (DDPG), an actor-critic method, to learn a continuous force policy. Disturbances are modeled as Ornstein–Uhlenbeck processes, and the policy is trained in a Gym/Keras-RL environment by maximizing the expected return $J = \mathbb{E}[r_1]$, where the reward $r$ penalizes state deviation and control effort: $r = -\Delta t ( \sum_{i} w_i (x_i - x_{i,des})^2 + 0.1 F^2 )$. The FIPWC dynamics are derived from a Lagrangian, include parametric uncertainty in stiffness and damping, and are integrated with a fourth-order Runge-Kutta method. The DRL results are benchmarked against a PD controller over 10k Monte Carlo trials. The findings indicate that, under stochastic disturbances, the DRL controller rejects disturbances better and converges the states more tightly than the PD controller. This suggests practical potential for robust control of uncertain nonlinear systems, although more realistic actuator modeling and state estimation would further improve applicability.
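The reward expression above can be illustrated with a minimal Python sketch. The function name `step_reward`, its argument names, and the numeric values in the usage lines are hypothetical and chosen only for illustration; the summary gives the form of the reward and the 0.1 effort weight, but not the state weights $w_i$ or the time step.

```python
import numpy as np

def step_reward(x, x_des, w, force, dt, effort_weight=0.1):
    """Per-step reward r = -dt * (sum_i w_i (x_i - x_des_i)^2 + effort_weight * F^2).

    x, x_des : current and desired state vectors (same length)
    w        : per-state weights (assumed values; not given in the summary)
    force    : scalar control force F applied at this step
    dt       : integration time step (assumed value)
    """
    x = np.asarray(x, dtype=float)
    x_des = np.asarray(x_des, dtype=float)
    w = np.asarray(w, dtype=float)
    state_cost = np.sum(w * (x - x_des) ** 2)   # weighted squared state error
    effort_cost = effort_weight * force ** 2    # control effort penalty
    return -dt * (state_cost + effort_cost)

# Illustrative call with made-up numbers: two states, unit weights.
r_example = step_reward(x=[0.1, -0.2], x_des=[0.0, 0.0], w=[1.0, 1.0],
                        force=2.0, dt=0.01)
```

Because every term inside the parentheses is non-negative, the reward is at most zero and reaches zero only when the state sits at its target with no applied force.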

Abstract

This work describes a technique for active rejection of multiple independent and time-correlated stochastic disturbances for a nonlinear flexible inverted pendulum with cart system with uncertain model parameters. The control law is determined through deep reinforcement learning, specifically with a continuous actor-critic variant of deep Q-learning known as Deep Deterministic Policy Gradient, while the disturbance magnitudes evolve via independent stochastic processes. Simulation results are then compared with those from a classical control system.
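As a rough illustration of independent, time-correlated disturbance histories, the sketch below samples two Ornstein–Uhlenbeck processes with an Euler–Maruyama discretization. The function name `simulate_ou`, the discretization scheme, and all parameter values (`theta`, `mu`, `sigma`, step count, time step) are assumptions made for this example, not values taken from the paper.

```python
import numpy as np

def simulate_ou(n_steps, dt, theta, mu, sigma, x0=0.0, rng=None):
    """Euler-Maruyama sample path of dx = theta * (mu - x) dt + sigma dW.

    Returns an array of length n_steps + 1 starting at x0.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = rng.standard_normal(n_steps)  # one Gaussian draw per step
    for k in range(n_steps):
        drift = theta * (mu - x[k]) * dt          # pull toward the mean mu
        diffusion = sigma * np.sqrt(dt) * noise[k]  # scaled random kick
        x[k + 1] = x[k] + drift + diffusion
    return x

# Independent generators so the cart and pendulum disturbances are uncorrelated.
cart_seed, pend_seed = np.random.SeedSequence(0).spawn(2)
cart_dist = simulate_ou(1000, 0.01, theta=1.0, mu=0.0, sigma=0.5,
                        rng=np.random.default_rng(cart_seed))
pend_dist = simulate_ou(1000, 0.01, theta=1.0, mu=0.0, sigma=0.5,
                        rng=np.random.default_rng(pend_seed))
```

Each sample depends on the previous one through the drift term, which is what makes the disturbance time-correlated rather than white noise.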

Paper Structure (8 sections, 25 equations, 12 figures, 2 tables)

This paper contains 8 sections, 25 equations, 12 figures, 2 tables.

Introduction
Technical Approach
Flexible Inverted Pendulum with Cart System Modeling
Parametric Uncertainty Modeling
Deep Reinforcement Learning Control
Simulation Results
Discussion
Conclusions

Figures (12)

Figure 1: FIPWC system diagram.
Figure 2: FIPWC system diagram.
Figure 3: Sample time history of cart disturbance stochastic process.
Figure 4: Sample time history of pendulum disturbance stochastic process.
Figure 5: Block diagram for actor-critic agent.
...and 7 more figures
