Actor-Critic Cooperative Compensation to Model Predictive Control for Off-Road Autonomous Vehicles Under Unknown Dynamics

Prakhar Gupta, Jonathon M Smereka, Yunyi Jia

TL;DR

Problem addressed: achieving reliable longitudinal speed tracking for off-road vehicles on terrain with unknown dynamics. Approach: a cooperative parallel compensation scheme (AC3MPC) that couples a model predictive controller with a learning-based actor-critic, using augmented dynamics to anticipate compensation. Key contributions: a data-efficient learning framework, preservation of MPC horizon robustness, and improved tracking across unseen deformable terrains. Findings: AC3MPC outperforms standalone MPC and standalone AC by up to 29.2% and 10.2%, respectively, in RMS tracking error, generalizes better, and requires less training data, remaining usable even when under-trained. Significance: the method enables safer, more efficient real-time control on deformable terrains with potential deployment on real drive systems.

Abstract

This study presents an Actor-Critic Cooperative Compensated Model Predictive Controller (AC3MPC) designed to address unknown system dynamics. To avoid the difficulty of modeling highly complex dynamics while ensuring real-time control feasibility and performance, this work combines deep reinforcement learning with a model predictive controller in a cooperative framework. The model-based controller takes on the primary role, and each controller is provided with predictive information about the other. This improves tracking performance and retains the inherent robustness of the model predictive controller. We evaluate this framework for off-road autonomous driving on unknown deformable terrains that represent sandy deformable soil, sandy and rocky soil, and cohesive clay-like deformable soil. Our findings demonstrate that our controller statistically outperforms standalone model-based and learning-based controllers by up to 29.2% and 10.2%, respectively. The framework generalized well over varied and previously unseen terrain characteristics, tracking longitudinal reference speeds with lower errors. Furthermore, it required significantly less training data than a purely learning-based controller, while delivering better performance even when under-trained.

Paper Structure

This paper contains 9 sections, 4 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: AC3MPC controller training and simulation framework utilizes optimal control and learning paradigms. Cooperative and predictive information is exchanged to build anticipation of compensation and allows MPC to dictate control.
  • Figure 2: Performance of all controllers is plotted for 13 evaluation scenarios: driving on rigid terrain ($T0_r$), driving on deformable terrains ($T1_r$, $T2_r$, $T3_r$), and driving on the same deformable terrains with under-trained controllers ($T1_r^u$, $T2_r^u$, $T3_r^u$). RMS tracking error and a smoothness measure are plotted for all these scenarios; our framework generally has the lowest errors and jerks, even when under-trained.
  • Figure 3: Controller comparison for velocity tracking on a previously unseen soft clay-like terrain $T3$ for constant and varying reference velocity scenarios $T3_C$ and $T3_V$ respectively.
  • Figure 4: Control effort split for scenario $T3_C$: This figure shows the learnt compensation (AC3MPC$_{RL}$) that is added to the primary input (AC3MPC$_{MPC}$) while maintaining smooth control.
  • Figure 5: The compensated controller framework requires less training data. Performance for scenarios $T1_C^u$ and $T1_V^u$ is visualized here.
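
The comparisons above are reported as RMS tracking error and as percentage improvement over a baseline controller. The page does not give the paper's exact metric definitions, so the following is only a minimal sketch that assumes the standard root-mean-square of the speed error and a simple relative reduction. The function names `rms_tracking_error` and `relative_improvement` and the sample speeds are hypothetical and are not taken from the paper.

```python
import math


def rms_tracking_error(reference, actual):
    """Root-mean-square of (reference - actual) over one trajectory.

    Assumption: both sequences are sampled at the same time steps.
    """
    if len(reference) != len(actual) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - a) ** 2 for r, a in zip(reference, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rms, candidate_rms):
    """Percentage reduction in RMS error of a candidate versus a baseline."""
    if baseline_rms <= 0:
        raise ValueError("baseline RMS error must be positive")
    return 100.0 * (baseline_rms - candidate_rms) / baseline_rms


# Illustrative speeds in m/s (made-up values, not results from the paper).
reference_speed = [5.0, 5.0, 5.0, 5.0]
baseline_speed = [4.0, 4.0, 6.0, 6.0]    # hypothetical standalone controller
candidate_speed = [4.5, 4.5, 5.5, 5.5]   # hypothetical compensated controller

baseline_rms = rms_tracking_error(reference_speed, baseline_speed)
candidate_rms = rms_tracking_error(reference_speed, candidate_speed)
improvement = relative_improvement(baseline_rms, candidate_rms)
```

Under this definition, a figure such as "up to 29.2%" would be the largest value of `relative_improvement` across the evaluated scenarios, although the paper may aggregate its results differently.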