TamedPUMA: safe and stable imitation learning with geometric fabrics

Saray Bakker, Rodrigo Pérez-Dattari, Cosimo Della Santina, Wendelin Böhmer, Javier Alonso-Mora

TL;DR

Imitation learning for robot motion often relies on task-space dynamics that risk violating safety constraints. This work introduces TamedPUMA, a framework that couples learned stable motion primitives with geometric fabrics to yield second-order, constraint-aware motion policies. It presents two variations, the Forcing Policy Method (FPM) and the Compatible Potential Method (CPM); both ensure stability, and CPM provides stronger convergence guarantees under collision avoidance and joint limits. Experiments on a simulated and real 7-DoF manipulator show high success rates, real-time computation on the order of $5{-}7$ ms, and robust collision avoidance even with dynamic obstacles and online goal changes.

Abstract

Using the language of dynamical systems, imitation learning (IL) provides an intuitive and effective way of teaching robots stable task-space motions with goal convergence. Yet, IL techniques are affected by serious limitations when it comes to ensuring safety and fulfillment of physical constraints. With this work, we solve this challenge via TamedPUMA, an IL algorithm augmented with a recent development in motion generation called geometric fabrics. As both the IL policy and geometric fabrics describe motions as artificial second-order dynamical systems, we propose two variations where IL provides a navigation policy for geometric fabrics. The result is a stable imitation learning strategy within which we can seamlessly blend geometrical constraints like collision avoidance and joint limits. Beyond providing a theoretical analysis, we demonstrate TamedPUMA with simulated and real-world tasks, including a 7-DoF manipulator.
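
The idea that a goal-reaching policy and an avoidance behavior can both be written as second-order systems, and then summed at the acceleration level, can be illustrated with a toy example. The sketch below is not the TamedPUMA implementation and does not construct a geometric fabric: `goal_policy` is a hand-written critically damped spring standing in for a learned IL policy, `avoidance_accel` is a hypothetical velocity-dependent repulsion, and every name, gain, and the 2-D point-mass setting are assumptions made purely for illustration.

```python
import numpy as np


def goal_policy(x, v, goal, stiffness=4.0, damping=4.0):
    """Stand-in for a learned stable policy: a critically damped spring to the goal."""
    return -stiffness * (x - goal) - damping * v


def avoidance_accel(x, v, obstacle, radius, gain=1.0):
    """Hypothetical velocity-dependent repulsion from a circular obstacle.

    It acts only while the point approaches the obstacle, grows as the
    clearance shrinks, and vanishes at rest, so it cannot shift the goal
    equilibrium of the combined system.
    """
    diff = x - obstacle
    dist = max(np.linalg.norm(diff), 1e-9)
    direction = diff / dist                # unit vector pointing away from the obstacle
    clearance = max(dist - radius, 1e-6)   # distance to the obstacle surface
    approach_speed = min(float(np.dot(v, direction)), 0.0)  # nonzero only when approaching
    return gain * approach_speed**2 / clearance * direction


def simulate(start, goal, obstacle, radius, dt=0.005, steps=4000):
    """Integrate the summed accelerations with semi-implicit Euler steps."""
    x = np.asarray(start, dtype=float).copy()
    goal = np.asarray(goal, dtype=float)
    obstacle = np.asarray(obstacle, dtype=float)
    v = np.zeros_like(x)
    min_clearance = np.inf
    for _ in range(steps):
        # Both behaviors are accelerations, so they can simply be added.
        a = goal_policy(x, v, goal) + avoidance_accel(x, v, obstacle, radius)
        v = v + dt * a
        x = x + dt * v
        min_clearance = min(min_clearance, np.linalg.norm(x - obstacle) - radius)
    return x, float(min_clearance)
```

Calling, for example, `simulate(start=(-1.0, 0.3), goal=(1.0, 0.0), obstacle=(0.0, -0.5), radius=0.2)` returns the final position and the smallest clearance observed along the trajectory. Because the toy avoidance term pushes against the direction of approach, it only removes kinetic energy, which is why the spring's goal convergence survives in this simplified setting; the guarantees claimed for FPM and CPM rest on the paper's own analysis, not on this sketch.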

Paper Structure

This paper contains 15 sections, 1 theorem, 10 equations, 4 figures, 1 table.

Key Result

Theorem 1

In the region $\mathcal{T}$, $\bm{x}_{\mathrm{g}}$ is a globally asymptotically stable equilibrium of $\bm{f}_{\theta}^{\mathcal{T}}$ if, $\forall \bm{x} \in \mathcal{T}$, (1) $\bm{y}_{\mathrm{g}}=\bm{\rho}_{\theta}(\bm{x}_{\mathrm{g}})$ is a globally asymptotically stable equilibrium of $\bm{f}_{\t

Figures (4)

  • Figure 1: This illustration of TamedPUMA shows the behavior design given the relationships between the different task-space and configuration-space variables. The joint angles and velocities are mapped into task space, where the desired behavior is specified. Via fabrics, all avoidance behaviors are defined using the joint limits and the varying obstacle positions, e.g., the position of the bowl. The DNN captures the desired behavior of the end-effector position and orientation.
  • Figure 2: Selected time frames of CPM during a tomato-picking task with the bowl and hand as dynamic obstacles.
  • Figure 3: Selected time frames of CPM during a pouring task with the yellow helmet as a static obstacle.
  • Figure 4: Selected time frames of CPM during a pouring task where the goal is changed online by the user.

Theorems & Definitions (2)

  • Theorem 1: Stability conditions (perez2023deepmetric)
  • Definition 1: Compatible potential (ratliff2023fabrics)