Safe and Optimal Variable Impedance Control via Certified Reinforcement Learning

Shreyas Kumar; Ravi Prakash

TL;DR

This work addresses unstable exploration when variable impedance control is learned with model-free reinforcement learning. It introduces Certified Gaussian Manifold Sampling (C-GMS), a trajectory-centric framework that restricts policy exploration to a Lyapunov-certified manifold of stable gain schedules, so that stability and actuator feasibility hold by construction. A convergence theorem guarantees uniformly ultimately bounded tracking error under bounded disturbances, and experiments in simulation and on a real 7-DoF robot demonstrate safe, compliant handover trajectories with an actuator-limit governor. By combining model-free learning with formal stability guarantees, the approach offers a practical route to reliable autonomous interaction in dynamic, uncertain environments.

Abstract

Reinforcement learning (RL) offers a powerful approach for robots to learn complex, collaborative skills by combining Dynamic Movement Primitives (DMPs) for motion and Variable Impedance Control (VIC) for compliant interaction. However, this model-free paradigm risks instability and unsafe exploration because the impedance gains vary over time. This work introduces Certified Gaussian Manifold Sampling (C-GMS), a novel trajectory-centric RL framework that learns combined DMP and VIC policies while guaranteeing Lyapunov stability and actuator feasibility by construction. Our approach reframes policy exploration as sampling from a mathematically defined manifold of stable gain schedules. Every policy rollout is therefore stable and physically realizable, which eliminates the need for reward penalties or post-hoc validation. Furthermore, we provide a theoretical guarantee that the tracking error remains bounded even in the presence of bounded model errors and deployment-time uncertainties. We demonstrate the effectiveness of C-GMS in simulation and verify it on a real robot, paving the way for reliable autonomous interaction in complex environments.
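
To make the idea of sampling only stable gain schedules concrete, the following is a minimal single-degree-of-freedom sketch. It is not the C-GMS certificate from the paper. It assumes the error dynamics m*e'' + d*e' + k(t)*e = 0 with constant mass m and damping d, for which the Lyapunov function V = 0.5*m*(e' + alpha*e)^2 + 0.5*(k + alpha*d - alpha^2*m)*e^2 is non-increasing whenever d >= alpha*m and dk/dt <= 2*alpha*k. The function name, parameters, and the clip-then-project scheme are all hypothetical choices made for illustration.

```python
import math
import random


def sample_certified_stiffness(mean, std, alpha, dt, k_min, k_max,
                               mass, damping, rng):
    """Draw a Gaussian stiffness schedule and project it onto a stable set.

    Illustrative assumption (not taken from the paper): the schedule is
    accepted when k_min <= k[t] <= k_max and k[t+1] <= k[t]*exp(2*alpha*dt),
    with constant damping satisfying damping >= alpha*mass.
    """
    if k_min <= 0.0:
        raise ValueError("k_min must be positive")
    if damping < alpha * mass:
        raise ValueError("damping must be at least alpha * mass")

    # Largest growth factor allowed over one step: the exact solution of
    # dk/dt = 2*alpha*k integrated over dt.
    growth = math.exp(2.0 * alpha * dt)

    schedule = []
    for mu in mean:
        k = rng.gauss(mu, std)             # unconstrained Gaussian exploration
        k = min(max(k, k_min), k_max)      # actuator-style box limits
        if schedule:
            k = min(k, schedule[-1] * growth)  # enforce the growth-rate bound
        schedule.append(k)
    return schedule
```

Because the projection is applied while sampling, every schedule returned by this sketch satisfies the assumed condition, so no reward penalty or after-the-fact rejection is needed in this toy setting. Decreases in stiffness are unrestricted here, since only stiffness growth can inject energy under the assumed condition.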
