Reinforcement Learning-Based Neuroadaptive Control of Robotic Manipulators under Deferred Constraints
Hamed Rahimi Nohooji, Abolfazl Zaraki, Holger Voos
TL;DR
This work tackles constrained trajectory tracking for robotic manipulators under uncertainty by fusing a smooth zone barrier Lyapunov function with a prescribed-time shifting mechanism to enable deferred constraint activation. A model-free actor–critic reinforcement learning scheme learns the control policy online, while a Lyapunov-based stability analysis proves semi-global uniform ultimate boundedness of all signals and satisfaction of time-varying constraints after a prescribed time $T_c$. The critic approximates the cost-to-go and the actor updates the control law $\tau=\hat{W}_a^T S_a(Z_a)-K_2 Z_2-\frac{\gamma Z_1^\gamma}{\beta(k_c^2-Z_1^{\gamma T}Z_1^\gamma)}$, both with regularization to ensure bounded weights. Numerical simulations on a two-link manipulator validate accurate tracking, smooth constraint activation, and bounded adaptive parameters, highlighting potential benefits for energy efficiency and safe operation in uncertain environments.
Abstract
This paper presents a reinforcement learning-based neuroadaptive control framework for robotic manipulators operating under deferred constraints. The proposed approach improves on traditional barrier Lyapunov functions by introducing a smooth constraint enforcement mechanism that offers two key advantages: (i) it minimizes control effort in unconstrained regions and progressively increases it near the constraints, improving energy efficiency, and (ii) it enables gradual constraint activation through a prescribed-time shifting function, allowing safe operation even when the initial conditions violate the constraints. To address system uncertainties and improve adaptability, an actor-critic reinforcement learning framework is employed. The critic network estimates the value function, while the actor network learns an optimal control policy in real time, enabling adaptive constraint handling without requiring explicit system modeling. Lyapunov-based stability analysis guarantees the boundedness of all closed-loop signals. The effectiveness of the proposed method is validated through numerical simulations.
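As a rough illustration of the control law quoted in the TL;DR, the following minimal sketch evaluates $\tau=\hat{W}_a^T S_a(Z_a)-K_2 Z_2-\frac{\gamma Z_1^\gamma}{\beta(k_c^2-Z_1^{\gamma T}Z_1^\gamma)}$ for given inputs. Several points are assumptions made only for this example and are not taken from the paper: $\gamma$ and $\beta$ are treated as positive scalars, $k_c$ as a scalar bound, $Z_1^\gamma$ as an already-computed error vector, and the actor basis output $S_a(Z_a)$ as a precomputed feature vector. The function names are hypothetical. The sketch does not include the weight update laws, the critic, or the shifting function.

```python
import numpy as np


def barrier_term(z1_gamma, gamma, beta, k_c):
    """Barrier-type feedback term gamma * Z / (beta * (k_c^2 - Z^T Z)).

    Assumption: gamma, beta and k_c are positive scalars and z1_gamma is the
    (already transformed) tracking error vector.
    """
    z1_gamma = np.asarray(z1_gamma, dtype=float)
    gap = k_c**2 - z1_gamma @ z1_gamma
    if gap <= 0.0:
        # The expression is only defined strictly inside the constraint set.
        raise ValueError("error norm must be strictly smaller than k_c")
    return gamma * z1_gamma / (beta * gap)


def control_law(W_a_hat, S_a, K2, z2, z1_gamma, gamma, beta, k_c):
    """Evaluate tau = W_a_hat^T S_a - K2 z2 - barrier_term(...).

    Assumed shapes: W_a_hat is (l, n), S_a is (l,), K2 is (n, n),
    z2 and z1_gamma are (n,).
    """
    actor_output = np.asarray(W_a_hat, dtype=float).T @ np.asarray(S_a, dtype=float)
    damping = np.asarray(K2, dtype=float) @ np.asarray(z2, dtype=float)
    return actor_output - damping - barrier_term(z1_gamma, gamma, beta, k_c)


if __name__ == "__main__":
    # Hypothetical numbers for a two-joint arm with three actor basis functions.
    W_a_hat = np.zeros((3, 2))          # untrained actor weights
    S_a = np.array([0.2, 0.5, 0.3])     # precomputed basis outputs
    K2 = np.eye(2)
    tau = control_law(W_a_hat, S_a, K2, z2=[1.0, -1.0],
                      z1_gamma=[0.3, 0.4], gamma=1.0, beta=1.0, k_c=1.0)
    print("tau =", tau)

    # The barrier term stays small far from the bound and grows near it.
    direction = np.array([1.0, 0.0])
    for scale in (0.1, 0.5, 0.9, 0.99):
        b = barrier_term(scale * direction, gamma=1.0, beta=1.0, k_c=1.0)
        print(f"|Z| = {scale:.2f} -> |barrier term| = {np.linalg.norm(b):.3f}")
```

Under these assumptions, the loop at the end reflects the first advantage claimed in the abstract: the barrier contribution is small when the error is well inside the bound and increases sharply as the error norm approaches $k_c$.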
