Grounded Curriculum Learning
Linji Wang, Zifan Xu, Peter Stone, Xuesu Xiao
TL;DR
Grounded Curriculum Learning (GCL) addresses the mismatch between simulated task distributions and real-world robotics tasks by grounding the curriculum in real-world data while adaptively shaping task difficulty. It uses a dual-agent setup: a fully informed teacher, modeled as an MDP, generates tasks through a latent space learned by a Variational Autoencoder (VAE), and a student, modeled as a POMDP, learns from the resulting real-world-grounded curriculum, with an antagonist agent guiding learning through regret signals. On the BARN navigation benchmark, GCL achieves higher task success, navigation progress, and reward than a state-of-the-art curriculum learning method and a human-designed curriculum, and ablations confirm the crucial role of real-world grounding and performance-history awareness. By tightly coupling curriculum design to real-world task distributions, the approach offers a practical path toward more sample-efficient robotic learning that is robust to the sim-to-real gap.
Abstract
The high cost of real-world data for robotics Reinforcement Learning (RL) has led to the widespread use of simulators. Despite extensive work on building better simulator dynamics models that match the real world, there is another, often-overlooked mismatch between simulation and the real world: the distribution of available training tasks. This mismatch is further exacerbated by existing curriculum learning techniques, which automatically vary the simulation task distribution without considering its relevance to the real world. Given these challenges, we posit that curriculum learning for robotics RL needs to be grounded in real-world task distributions. To this end, we propose Grounded Curriculum Learning (GCL), which aligns the simulated task distribution in the curriculum with the real world and explicitly considers which tasks have been given to the robot and how the robot has performed in the past. We validate GCL using the BARN dataset on complex navigation tasks, achieving success rates 6.8% and 6.5% higher than a state-of-the-art curriculum learning method and a curriculum designed by human experts, respectively. These results show that GCL can enhance learning efficiency and navigation performance by grounding the simulation task distribution in the real world within an adaptive curriculum.
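To make the regret signal from the TL;DR concrete, here is a minimal sketch of how a teacher might score candidate tasks. It is not the paper's implementation. It assumes a PAIRED-style regret estimate (best antagonist return minus mean student return), which the text above does not specify. The function names `estimate_regret` and `pick_task`, the latent dimension, and the return values are all hypothetical. The candidate latents are drawn from a standard normal distribution only as a stand-in for samples from a VAE latent space.

```python
import numpy as np

def estimate_regret(antagonist_returns, student_returns):
    """Assumed PAIRED-style estimate: best antagonist return minus mean student return."""
    return float(np.max(antagonist_returns) - np.mean(student_returns))

def pick_task(candidate_latents, regrets):
    """Return the candidate latent vector with the highest estimated regret."""
    return candidate_latents[int(np.argmax(regrets))]

# Hypothetical candidates: 3 task latents of dimension 4 (stand-in for VAE samples).
rng = np.random.default_rng(0)
latents = rng.standard_normal((3, 4))

# Made-up episode returns per candidate task, two episodes each.
antagonist = [[10.0, 12.0], [8.0, 9.0], [15.0, 14.0]]
student = [[9.0, 11.0], [8.0, 8.0], [5.0, 7.0]]

regrets = [estimate_regret(a, s) for a, s in zip(antagonist, student)]
chosen = pick_task(latents, regrets)
```

In this toy setup the third candidate has the largest gap between antagonist and student returns, so it is the task proposed next. A real teacher would learn a policy over the latent space rather than rank a fixed candidate set.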
