Towards Online Safety Corrections for Robotic Manipulation Policies
Ariana Spalter, Mark Roberts, Laura M. Hiatt
TL;DR
A hybrid approach is presented, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime to ensure safe execution in the presence of new obstacles not present during training.
Abstract
Recent successes in applying reinforcement learning (RL) for robotics has shown it is a viable approach for constructing robotic controllers. However, RL controllers can produce many collisions in environments where new obstacles appear during execution. This poses a problem in safety-critical settings. We present a hybrid approach, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime. This ensures safe execution in the presence of new obstacles not present during training. Preliminary experiments illustrate our iKinQP-RL framework completely eliminates collisions with new obstacles while maintaining a high task success rate.
