Constrained Behavior Cloning for Robotic Learning
Wensheng Liang, Jun Xie, Zhicheng Wang, Jianwei Tan, Xiaoguang Ma
TL;DR
This work tackles the brittleness of behavior cloning in robotics due to limited perception and distribution shift by introducing GHCBC, a framework that fuses historically constrained BC (HCBC) with geometrically constrained BC (GCBC). HCBC encodes vision and action histories via a Transformer-based encoder with KL regularization to capture temporal structure, while GCBC imposes high-level geometric constraints on joint and end-effector poses to focus learning on relative pose information. The integrated Constrained Pose Transformer (GHCBC) predicts action sequences using vision, geometry, and history, employing action chunking and temporal ensembling, with history buffers that reset on gripper state changes. Empirical results on RLBench tasks with Sawyer robots show substantial improvements over a state-of-the-art BC baseline in both simulation and real-world settings, highlighting gains in robustness, stability, and long-horizon manipulation capabilities that are valuable for practical robotic imitation learning.
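The history-buffer reset and temporal ensembling mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the names `HistoryBuffer` and `temporal_ensemble`, the buffer length, and the exponential weights `w_i = exp(-m * i)` (the scheme used in the ACT work on action chunking) are all assumptions made for illustration, since the summary does not specify them.

```python
from collections import deque

import numpy as np


class HistoryBuffer:
    """Fixed-length history of (observation, action) pairs.

    Hypothetical helper: the buffer is cleared whenever the gripper
    state flips, mirroring the reset rule described in the summary.
    """

    def __init__(self, max_len):
        self.items = deque(maxlen=max_len)
        self.last_gripper_open = None

    def push(self, obs, action, gripper_open):
        # Drop stale history on an open/close transition of the gripper.
        if (self.last_gripper_open is not None
                and gripper_open != self.last_gripper_open):
            self.items.clear()
        self.items.append((obs, action))
        self.last_gripper_open = gripper_open


def temporal_ensemble(predictions, m=0.1):
    """Blend every prediction made for the current timestep.

    `predictions` is ordered oldest first, one row per overlapping
    action chunk. The weights exp(-m * i) are an assumed choice,
    not something stated in the source.
    """
    preds = np.asarray(predictions, dtype=float)
    weights = np.exp(-m * np.arange(len(preds)))
    weights /= weights.sum()  # normalize so the weights sum to 1
    return weights @ preds
```

In a control loop under these assumptions, each new chunk predicted by the policy would contribute one row to `predictions` for every future step it covers, and the executed action would be the output of `temporal_ensemble` for the current step.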
Abstract
Behavior cloning (BC) is a popular supervised imitation learning method in communities such as robotics and autonomous driving, wherein complex skills are learned by directly imitating expert demonstrations. Despite its rapid development, BC still suffers from a limited field of view, in which accumulated sensor and joint noise leads to compounding errors. In this paper, we introduce geometrically and historically constrained behavior cloning (GHCBC), which, inspired by neuroscience, gives priority to high-level state information. Geometrically constrained behavior cloning (GCBC) is used to constrain the predicted poses geometrically, and historically constrained behavior cloning (HCBC) is used to constrain action sequences temporally. The synergy between these two types of constraints improves the robustness and stability of BC. Comprehensive experiments show that, compared with a state-of-the-art BC method, success rates improve on average by 29.73% in simulation and by 39.4% in real-robot experiments, especially in long-horizon manipulation scenarios, indicating the strong potential of GHCBC for robotic learning.
