Generating Diverse Challenging Terrains for Legged Robots Using Quality-Diversity Algorithm
Arthur Esquerre-Pourtère, Minsoo Kim, Jaeheung Park
TL;DR
This paper addresses the challenge of evaluating and improving the robustness of legged robot controllers on unstructured terrains. It introduces a Quality-Diversity framework based on CMA-MAE that generates a diverse archive of terrains revealing multiple failure modes while treating the controller as a black box. The framework uses a six-descriptor space capturing penalties and a fitness that promotes challenging yet consistent failures: $f(S)=\sum_j\mathrm{mean}(\mathrm{pen}_j(S)) - \alpha\sum_j\mathrm{std}(\mathrm{pen}_j(S)) - \lambda\,\mathrm{u}(S)$, with $\alpha=1$ and $\lambda=2$. Experiments on Cassie (biped) and ANYmal (quadruped) demonstrate the framework's ability to uncover varied weaknesses and show that terrain-induced experiences can be used to fine-tune RL controllers for improved performance on hard terrains. The work advances robustness testing for legged locomotion by providing a scalable method to identify corner cases and by illustrating practical improvements in controller performance through generated terrains.
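The fitness in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: the mean and standard deviation are taken over repeated rollouts on the same terrain $S$, the standard deviation is the population form, and $\mathrm{u}(S)$ (not defined in this summary) is passed in as a precomputed scalar. The names `terrain_fitness`, `penalties`, and `u` are hypothetical.

```python
import numpy as np


def terrain_fitness(penalties, u, alpha=1.0, lam=2.0):
    """Sketch of f(S) = sum_j mean(pen_j) - alpha * sum_j std(pen_j) - lam * u.

    penalties: array of shape (n_rollouts, n_penalties); entry [i, j] is
               penalty j observed in rollout i on terrain S (assumed layout).
    u:         scalar value of the u(S) term, computed elsewhere (assumed).
    alpha, lam: weights; defaults follow the values stated in the summary.
    """
    penalties = np.asarray(penalties, dtype=float)
    if penalties.ndim != 2:
        raise ValueError("penalties must have shape (n_rollouts, n_penalties)")

    # Reward terrains that are hard on average...
    mean_term = penalties.mean(axis=0).sum()
    # ...but discount terrains whose difficulty varies between rollouts.
    # Population standard deviation (ddof=0) is an assumption.
    std_term = penalties.std(axis=0).sum()

    return mean_term - alpha * std_term - lam * u


# Two rollouts, two penalty terms (made-up numbers for illustration only).
example = terrain_fitness([[1.0, 2.0], [3.0, 2.0]], u=0.5)
```

Under these assumptions, a terrain that fails the controller in the same way on every rollout scores higher than one with the same mean penalty but a large spread, which matches the stated goal of "challenging yet consistent failures".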
Abstract
While legged robots have achieved significant advancements in recent years, ensuring the robustness of their controllers on unstructured terrains remains challenging. It requires generating diverse and challenging unstructured terrains to test the robot and discover its vulnerabilities. This topic remains underexplored in the literature. This paper presents a Quality-Diversity framework to generate diverse and challenging terrains that uncover weaknesses in legged robot controllers. Our method, applied to both simulated bipedal and quadruped robots, produces an archive of terrains optimized to challenge the controller in different ways. Quantitative and qualitative analyses show that the generated archive contains terrains that the robots struggle to traverse and that present different failure modes, including failure cases that were not necessarily expected. Experiments show that the generated terrains can also be used to improve RL-based controllers.
