Fine-Tuning Hard-to-Simulate Objectives for Quadruped Locomotion: A Case Study on Total Power Saving
Ruiqian Nai, Jiacheng You, Liu Cao, Hanchen Cui, Shiyuan Zhang, Huazhe Xu, Yang Gao
TL;DR
The paper addresses the challenge of optimizing hard-to-simulate objectives, such as total power consumption, in quadruped locomotion by learning a data-driven measurement model from real-world data and integrating it into simulation. The core method combines a multi-objective reward, a data-driven factor predictor, iterative policy refinement with KL constraints, and hierarchical policy selection to bridge sim-to-real gaps. Empirically, the approach yields substantial net power reductions of 24-28% across speeds (vs. a pre-trained baseline), with robust gains in both controlled and in-the-wild conditions, outperforming an analytical proxy baseline. The work offers a practical, adaptable paradigm for continual real-world knowledge incorporation to improve hard-to-simulate objectives in legged robotics, with potential applicability to other domains.
Abstract
Legged locomotion is not just about mobility; it also encompasses crucial objectives such as energy efficiency, safety, and user experience, which are vital for real-world applications. However, key factors such as battery power consumption and stepping noise are often inaccurately modeled or missing in common simulators, leaving these aspects poorly optimized or unaddressed by current sim-to-real methods. Hand-designed proxies, such as mechanical power and foot contact forces, have been used to address these challenges but are often problem-specific and inaccurate. In this paper, we propose a data-driven framework for fine-tuning locomotion policies that targets these hard-to-simulate objectives. Our framework uses real-world data to model these objectives and incorporates the learned model into simulation for policy improvement. We demonstrate the effectiveness of our framework on power saving for quadruped locomotion, achieving a 24-28% net reduction in total power drawn from the battery pack at various speeds. In essence, our approach offers a versatile solution for optimizing hard-to-simulate objectives in quadruped locomotion, providing an easy-to-adapt paradigm for continual improvement with real-world knowledge. Project page: https://hard-to-sim.github.io/.
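The core idea stated above, fitting a measurement model on real-world data and then using its prediction as a reward term in simulation, can be illustrated with a minimal sketch. This is not the paper's implementation: the linear least-squares model, the feature layout, the function names (`fit_power_model`, `predict_power`, `shaped_reward`), and the coefficient `power_coef` are all assumptions chosen for brevity, and the data are synthetic.

```python
import numpy as np

def add_bias(features):
    """Append a constant column so the model can learn a baseline power draw."""
    return np.hstack([features, np.ones((features.shape[0], 1))])

def fit_power_model(features, measured_power):
    """Least-squares fit of measured power from per-step features.

    Stand-in for a learned measurement model; a linear map is an assumption.
    """
    weights, *_ = np.linalg.lstsq(add_bias(features), measured_power, rcond=None)
    return weights

def predict_power(weights, features):
    """Predict power for new (e.g. simulated) feature rows."""
    return add_bias(features) @ weights

def shaped_reward(task_reward, predicted_power, power_coef=0.01):
    """Multi-objective reward: task term minus a penalty on predicted power.

    power_coef is a hypothetical trade-off weight.
    """
    return task_reward - power_coef * predicted_power

# Synthetic "real-world log": 3 hypothetical features per step, noise-free power.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
true_weights = np.array([2.0, -1.0, 0.5])
measured_power = features @ true_weights + 40.0  # 40.0 plays the role of idle draw

weights = fit_power_model(features, measured_power)
predicted = predict_power(weights, features)
reward = shaped_reward(np.ones(200), predicted)
```

In a fine-tuning loop, `predict_power` would be evaluated on features produced by the simulator, so the policy is penalized according to a model grounded in measured data rather than a hand-designed proxy such as mechanical power.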
