Reinforcement Learning-Based Neuroadaptive Control of Robotic Manipulators under Deferred Constraints

Hamed Rahimi Nohooji, Abolfazl Zaraki, Holger Voos

TL;DR

This work tackles constrained trajectory tracking for robotic manipulators under uncertainty by fusing a smooth zone barrier Lyapunov function with a prescribed-time shifting mechanism to enable deferred constraint activation. A model-free actor–critic reinforcement learning scheme learns the control policy online, while a Lyapunov-based stability analysis proves semi-global uniform ultimate boundedness of all signals and satisfaction of time-varying constraints after a prescribed time $T_c$. The critic approximates the cost-to-go and the actor updates the control law $\tau=\hat{W}_a^T S_a(Z_a)-K_2 Z_2-\frac{\gamma Z_1^\gamma}{\beta(k_c^2-Z_1^{\gamma T}Z_1^\gamma)}$, both with regularization to ensure bounded weights. Numerical simulations on a two-link manipulator validate accurate tracking, smooth constraint activation, and bounded adaptive parameters, highlighting potential benefits for energy efficiency and safe operation in uncertain environments.
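The control law quoted above can be evaluated numerically as in the following minimal sketch. It makes several assumptions that this summary does not confirm: the shifted error $Z_1^\gamma$ is passed in as a precomputed vector, $\gamma$ and $\beta$ are treated as scalars, $k_c$ is a single scalar bound, and the basis outputs $S_a(Z_a)$ are supplied by the caller. The function name `control_torque` is hypothetical.

```python
def control_torque(W_a, S_a, Z1_shift, Z2, K2, gamma, beta, k_c):
    """Evaluate tau = W_a^T S_a - K2 Z2 - gamma*Z1_shift / (beta*(k_c^2 - |Z1_shift|^2)).

    W_a      : actor weights, list of rows, shape (n_nodes, n_joints)
    S_a      : basis-function outputs, length n_nodes (assumed given)
    Z1_shift : shifted position error Z_1^gamma, length n_joints (assumed given)
    Z2       : second error variable, length n_joints
    K2       : gain matrix, shape (n_joints, n_joints)
    gamma, beta, k_c : scalars (an assumption of this sketch)
    """
    n = len(Z2)
    # Distance of the shifted error from the barrier; must stay positive.
    gap = k_c ** 2 - sum(z * z for z in Z1_shift)
    if gap <= 0.0:
        raise ValueError("shifted error is on or outside the barrier")
    tau = []
    for i in range(n):
        actor = sum(W_a[j][i] * S_a[j] for j in range(len(S_a)))  # (W_a^T S_a)_i
        damping = sum(K2[i][j] * Z2[j] for j in range(n))         # (K2 Z2)_i
        barrier = gamma * Z1_shift[i] / (beta * gap)              # barrier term
        tau.append(actor - damping - barrier)
    return tau
```

Under these assumptions the barrier term is small while the shifted error is far from $k_c$ and grows without bound as the error approaches it, which matches the behavior described for the zone barrier: low effort in the interior, increasing effort near the constraint.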

Abstract

This paper presents a reinforcement learning-based neuroadaptive control framework for robotic manipulators operating under deferred constraints. The proposed approach improves traditional barrier Lyapunov functions by introducing a smooth constraint enforcement mechanism that offers two key advantages: (i) it minimizes control effort in unconstrained regions and progressively increases it near constraints, improving energy efficiency, and (ii) it enables gradual constraint activation through a prescribed-time shifting function, allowing safe operation even when initial conditions violate constraints. To address system uncertainties and improve adaptability, an actor-critic reinforcement learning framework is employed. The critic network estimates the value function, while the actor network learns an optimal control policy in real time, enabling adaptive constraint handling without requiring explicit system modeling. Lyapunov-based stability analysis guarantees the boundedness of all closed-loop signals. The effectiveness of the proposed method is validated through numerical simulations.

Paper Structure

This paper contains 8 sections, 3 theorems, 49 equations, 5 figures.

Key Result

Lemma 1

For any $T_c>0$, the function $\gamma(t)$ satisfies:

Figures (5)

  • Figure 1: Joint Position Tracking with Constraints. The desired trajectory $q_d(t)$ (dashed) and the actual joint positions $q(t)$ (solid) are plotted along with the constraint boundaries $q_{d,i}(t)\pm k_{c,i}(t)$ (red dashed), where $k_{c1}(t)=0.5+0.1\sin(0.5t)$ and $k_{c2}(t)=0.45+0.1\cos(0.5t)$.
  • Figure 2: Joint Velocity Tracking. The desired velocities (dashed) and actual velocities (solid) are shown for both joints.
  • Figure 3: Position Tracking Errors with Constraint Boundaries. The tracking errors $Z_{1i}(t)=q_i(t)-q_{d,i}(t)$ are plotted together with the error bounds $\pm k_{c,i}(t)$ for $i=1,2$.
  • Figure 4: Norms of the Actor Network Weights. The norms of $\hat{W}_{ai}(t)$ are plotted for both joints.
  • Figure 5: Input Torques. The control inputs $\tau_i(t)$ for each joint are plotted.
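
The time-varying bounds given in the Figure 1 caption, and the error check implied by Figure 3, can be sketched as follows. The bound formulas are taken from the caption; the function names and the activation time argument `t_c` are hypothetical, and no value of $T_c$ is stated in this summary. The sketch treats the constraint as required only from $T_c$ onward, following the deferred-constraint description.

```python
import math

def constraint_bounds(t):
    """Bounds k_c1(t), k_c2(t) as given in the Figure 1 caption."""
    k_c1 = 0.5 + 0.1 * math.sin(0.5 * t)
    k_c2 = 0.45 + 0.1 * math.cos(0.5 * t)
    return (k_c1, k_c2)

def error_within_bounds(t, z1):
    """True if |Z_1i(t)| < k_ci(t) for both joints."""
    return all(abs(z) < k for z, k in zip(z1, constraint_bounds(t)))

def deferred_check(t, z1, t_c):
    """Deferred constraint: not required before t_c (hypothetical value),
    required to hold from t_c onward."""
    return t < t_c or error_within_bounds(t, z1)
```

For example, an initial error of 0.6 rad on joint 1 exceeds $k_{c1}(0)=0.5$, so `error_within_bounds(0.0, (0.6, 0.0))` is false, while `deferred_check(0.0, (0.6, 0.0), t_c=1.0)` passes because the constraint is not yet active.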

Theorems & Definitions (6)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • proof
  • Remark 1
  • proof