Adaptive Actor-Critic Based Optimal Regulation for Drift-Free Uncertain Nonlinear Systems

Ashwin P. Dani; Shubhendu Bhasin

TL;DR

This work tackles optimal regulation for drift-free nonlinear systems with unknown input gain matrices $g(x,\theta)$ by formulating a continuous-time adaptive actor-critic (AAC) reinforcement learning controller. It employs concurrent learning to identify the constant parameter vector $\theta$ in $g(x,\theta)$ while critic and actor NNs approximate the value function $V^*(\bar{x})$ and the optimal policy $u^*$, guided by the Bellman error $\delta$. A Lyapunov-based analysis shows the closed-loop signals are uniformly ultimately bounded (UUB), with a finite-excitation condition ensuring parameter convergence and a sigma-modification safeguard when excitation is incomplete. Simulation studies on image-based visual servoing (IBVS) and wheeled mobile robots (WMR) validate near-optimal regulation, bounded weights, and convergence of parameter estimates, demonstrating practical applicability to robotics with uncertain input gain.
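The concurrent-learning idea mentioned above can be illustrated with a minimal sketch. It is not the paper's update law: it assumes the input term is linear in the unknown parameters, $g(x,\theta)u = Y(x,u)\theta$, that the state derivative is available (measured or estimated), and it uses a hypothetical unicycle-like regressor with two unknown input-gain scalings. The names `regressor`, `record_history`, `cl_update`, and `estimate` are illustrative only.

```python
import numpy as np

def regressor(x, u):
    """Hypothetical regressor Y(x, u) with g(x, theta) u = Y(x, u) @ theta."""
    psi = x[2]  # heading angle of a unicycle-like drift-free system (assumption)
    return np.array([
        [u[0] * np.cos(psi), 0.0],
        [u[0] * np.sin(psi), 0.0],
        [0.0,                u[1]],
    ])

def record_history(theta_true, n_points=10, seed=0):
    """Store (Y_j, x_dot_j) pairs; x_dot is assumed measurable here."""
    rng = np.random.default_rng(seed)
    stack = []
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0, size=3)
        u = rng.uniform(-1.0, 1.0, size=2)
        Y = regressor(x, u)
        stack.append((Y, Y @ theta_true))
    return stack

def finite_excitation_ok(stack, tol=1e-6):
    """Check that sum_j Y_j^T Y_j is positive definite (rank condition)."""
    S = sum(Y.T @ Y for Y, _ in stack)
    return np.linalg.eigvalsh(S).min() > tol

def cl_update(theta_hat, stack, gain, dt):
    """One Euler step of theta_hat_dot = gain * sum_j Y_j^T (x_dot_j - Y_j theta_hat)."""
    correction = sum(Y.T @ (x_dot - Y @ theta_hat) for Y, x_dot in stack)
    return theta_hat + dt * gain * correction

def estimate(theta_true, steps=5000, gain=1.0, dt=0.01):
    """Run the recorded-data update from a zero initial estimate."""
    stack = record_history(theta_true)
    theta_hat = np.zeros_like(theta_true)
    for _ in range(steps):
        theta_hat = cl_update(theta_hat, stack, gain, dt)
    return theta_hat, finite_excitation_ok(stack)
```

Because the update reuses stored data rather than relying on the current trajectory alone, the estimate can converge once the recorded stack satisfies the rank check, which mirrors the role of the finite-excitation condition described above.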

Abstract

In this paper, a continuous-time adaptive actor-critic reinforcement learning (RL) controller is developed for drift-free nonlinear systems. Practical examples of such systems are image-based visual servoing (IBVS) and wheeled mobile robots (WMR), where the system dynamics include a parametric uncertainty in the control effectiveness matrix and have no drift term. The uncertainty in the input term poses a challenge for developing a continuous-time RL controller using existing methods. An actor-critic, or synchronous policy iteration (PI)-based, RL controller is presented with a concurrent learning (CL)-based parameter update law for estimating the unknown parameters of the control effectiveness matrix. The infinite-horizon value function is minimized by regulating the current states to their desired values with near-optimal control effort. The proposed controller guarantees closed-loop stability, and simulation results on IBVS and WMR examples validate the proposed theory.
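For orientation, the problem class described in the abstract is commonly written as follows. This is a sketch of the standard infinite-horizon formulation for a drift-free system, not a transcription of the paper's equations: the state cost $Q(\bar{x})$ and the control weight $R$ are assumed symbols, and the paper's exact definitions may differ.

```latex
\begin{align}
  \dot{\bar{x}} &= g(x,\theta)\,u
    && \text{(drift-free dynamics, unknown constant } \theta\text{)} \\
  V^*(\bar{x}) &= \min_{u} \int_{t}^{\infty}
      \big( Q(\bar{x}) + u^{\top} R\, u \big)\, d\tau
    && \text{(assumed quadratic-in-control cost)} \\
  u^* &= -\tfrac{1}{2} R^{-1} g(x,\theta)^{\top}
      \big( \nabla V^*(\bar{x}) \big)^{\top}
    && \text{(stationarity of the Hamiltonian)} \\
  \delta &= Q(\bar{x}) + \hat{u}^{\top} R\, \hat{u}
      + \nabla \hat{V}(\bar{x})\, g(x,\hat{\theta})\, \hat{u}
    && \text{(Bellman error with estimates)}
\end{align}
```

Under these assumptions, $\delta$ vanishes when the critic estimate $\hat{V}$, the actor estimate $\hat{u}$, and the parameter estimate $\hat{\theta}$ coincide with $V^*$, $u^*$, and $\theta$, which is why it can serve as the training signal for the critic and actor weights.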

Paper Structure (17 sections, 1 theorem, 45 equations, 4 figures)

Introduction
System Model and Control Objective
System Dynamics
Controller Objective
Optimal Control Design using Actor-Critic Structure
Continuous RL-based Controller Design
Hamiltonian and Bellman Error
Approximate Optimal Control
Parameter Update Law
Bellman Error
Critic NN Weight Update Law
Actor NN Weight Update Law
Stability Analysis
Simulation Studies
Optimal IBVS Controller
...and 2 more sections

Key Result

Theorem 1

Given that Assumptions 1-5 hold and the following sufficient condition is satisfied, the actor-critic controller (eq:ApproxValueControl), along with the model parameter update law in (eq:thetaHatDot) and the critic and actor weight update laws in (eq:wCHatUpdate)-(eq:GammaUpdate), (eq:wAHatUpdate), guarantees that the signals $\bar{x}(t)$, $\tilde{\theta}(t)$, $\tilde{W}_a(t)$ and $\tilde{

Figures (4)

Figure 1: IBVS: (a) Regulation errors, (b) Control velocities, (c) Value.
Figure 2: IBVS: (a) Parameter estimates, (b) Critic weights, (c) Actor weights.
Figure 3: WMR regulation: (a) Regulation errors, (b) Control velocities, (c) Value.
Figure 4: WMR regulation: (a) Parameter estimates along with true parameters, (b) Critic weights, (c) Actor weights.

Theorems & Definitions (9)

Remark 1
Remark 2
Remark 3
Remark 4
Theorem 1
Proof
Remark 5
Remark 6
Remark 7
