Towards Online Safety Corrections for Robotic Manipulation Policies

Ariana Spalter; Mark Roberts; Laura M. Hiatt

TL;DR

A hybrid approach is presented, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime to ensure safe execution in the presence of new obstacles not present during training.

Abstract

Recent successes in applying reinforcement learning (RL) to robotics have shown that it is a viable approach for constructing robotic controllers. However, RL controllers can produce many collisions in environments where new obstacles appear during execution. This poses a problem in safety-critical settings. We present a hybrid approach, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime. This ensures safe execution in the presence of new obstacles not present during training. Preliminary experiments illustrate that our iKinQP-RL framework completely eliminates collisions with new obstacles while maintaining a high task success rate.
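
The idea of correcting a proposed action at runtime can be illustrated with a toy example. The sketch below is not the iKinQP controller from the paper. It assumes a 2D point robot, a single point obstacle, and a hypothetical function name (correct_action). It solves the smallest possible quadratic program: find the action closest to the one a policy proposes, subject to one linearized keep-out constraint. With a single half-space constraint the solution has a closed form, so no solver is needed.

```python
import math


def correct_action(pos, action, obstacle, d_safe):
    """Minimally modify a proposed 2D step so it respects a keep-out distance.

    Solves  min ||a - action||^2  s.t.  n . (pos + a - obstacle) >= d_safe,
    where n is the unit vector pointing from the obstacle to the current
    position. All names here are illustrative assumptions.
    """
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("position coincides with the obstacle")
    nx, ny = dx / dist, dy / dist

    # The constraint rearranges to n . a >= d_safe - dist.
    slack = nx * action[0] + ny * action[1] - (d_safe - dist)
    if slack >= 0.0:
        return action  # the proposed action is already safe

    # Otherwise project onto the constraint boundary along n.
    return (action[0] - slack * nx, action[1] - slack * ny)


# A step heading straight at the obstacle is shortened to stop at d_safe.
corrected = correct_action((1.0, 0.0), (-1.0, 0.0), (0.0, 0.0), 0.5)
# A step that stays clear of the obstacle passes through unchanged.
unchanged = correct_action((1.0, 0.0), (0.0, 1.0), (0.0, 0.0), 0.5)
```

Because n is a unit vector, satisfying the linearized constraint also keeps the true Euclidean distance to the obstacle at or above d_safe after the step. A manipulator controller would instead work in joint space with kinematic and joint-limit constraints, which this sketch leaves out.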

Paper Structure (13 sections, 2 equations, 7 figures, 4 tables, 2 algorithms)

Introduction
Background
The iKinQP-RL Approach
Experiment and Results
Experimental Setup
Results
Conclusion
Experimental Parameters
Modified iKinQP Algorithm
Grounding Choice of $n<m$ Intermediate Points Taken from iKinQP Output
Visualizing Closest Distance to Colliding Per Episode
RL Experiment Details
Deriving Failsafe Joint Positions

Figures (7)

Figure 1: iKinQP collision corrections for simplified 2D example.
Figure 2: Proposed iKinQP-RL Framework
Figure 3: Experimental Conditions for Block-1
Figure 4: Red block location chosen to directly block the path of the arm in getting to the goal region over the yellow block.
Figure 5: Red block location chosen to partially block the path of the arm in getting to the goal region over the yellow block.
...and 2 more figures
