RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations
Ezra Ameperosa, Jeremy A. Collins, Mrinal Jain, Animesh Garg
TL;DR
RoCoDA introduces a unified data augmentation framework that combines causality, SE(3) equivariance, and visual invariances to improve data-efficient imitation learning for robotics. By decomposing the state into causally relevant and irrelevant parts, applying SE(3) pose transformations with corresponding action adjustments, and incorporating standard augmentations, RoCoDA generates diverse, causally consistent demonstrations. Empirical results across five manipulation tasks show improved policy performance, stronger generalization to unseen poses and textures, and greater sample efficiency compared to baselines such as MimicGen. The approach provides a principled bridge between geometric symmetries and causal reasoning, enabling more robust and adaptable robotic policies.
Abstract
Imitation learning in robotics faces significant challenges in generalization due to the complexity of robotic environments and the high cost of data collection. We introduce RoCoDA, a novel method that unifies the concepts of invariance, equivariance, and causality within a single framework to enhance data augmentation for imitation learning. RoCoDA leverages causal invariance by modifying task-irrelevant subsets of the environment state without affecting the policy's output. Simultaneously, we exploit SE(3) equivariance by applying rigid body transformations to object poses and adjusting the corresponding actions to generate synthetic demonstrations. We validate RoCoDA through extensive experiments on five robotic manipulation tasks, demonstrating improvements in policy performance, generalization, and sample efficiency compared to state-of-the-art data augmentation methods. Our policies generalize robustly to unseen object poses and textures and to the presence of distractors. Furthermore, we observe emergent behavior such as re-grasping, indicating that policies trained with RoCoDA possess a deeper understanding of task dynamics. By leveraging invariance, equivariance, and causality, RoCoDA provides a principled approach to data augmentation in imitation learning, bridging the gap between geometric symmetries and causal reasoning. Project Page: https://rocoda.github.io
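
The two augmentation ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how actions are represented or adjusted, so the sketch assumes actions are absolute end-effector pose targets in the world frame, stored as 4x4 homogeneous transforms. All function names (`se3_augment`, `swap_irrelevant`, and the helpers) and the dictionary-based state layout are hypothetical. Under these assumptions, SE(3) equivariance means applying the same rigid transform to the object pose and to every action target, and causal invariance means replacing task-irrelevant state entries with values taken from another demonstration while leaving the actions unchanged.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [-sum(rt[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rt[0] + [p[0]], rt[1] + [p[1]], rt[2] + [p[2]],
            [0.0, 0.0, 0.0, 1.0]]

def yaw_transform(angle, dx, dy):
    """Rigid transform: rotation about z by `angle`, then translation (dx, dy, 0)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, dx],
            [s, c, 0.0, dy],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def se3_augment(object_pose, ee_targets, transform):
    """Apply one rigid transform to the object pose and to every action target.

    Assumption: actions are absolute world-frame end-effector poses, so the
    end-effector pose relative to the object is preserved.
    """
    new_object_pose = matmul(transform, object_pose)
    new_targets = [matmul(transform, target) for target in ee_targets]
    return new_object_pose, new_targets

def swap_irrelevant(state, donor_state, irrelevant_keys):
    """Copy task-irrelevant entries from a donor state; relevant entries are kept."""
    augmented = dict(state)
    for key in irrelevant_keys:
        augmented[key] = donor_state[key]
    return augmented

# Example: one demonstration step with a target 10 cm above the object origin.
object_pose = yaw_transform(0.3, 0.5, 0.1)
grasp_offset = yaw_transform(0.0, 0.0, 0.0)
grasp_offset[2][3] = 0.1
ee_targets = [matmul(object_pose, grasp_offset)]
new_pose, new_targets = se3_augment(
    object_pose, ee_targets, yaw_transform(math.pi / 2, -0.2, 0.4))
```

A useful sanity check for this kind of augmentation is that the target pose expressed in the object frame, `invert_rigid(object_pose) @ target`, is the same before and after the transform. If actions were instead relative end-effector deltas, the adjustment would differ, which is why the action representation is called out as an assumption here.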
