Sim-to-Real Transfer in Deep Reinforcement Learning for Bipedal Locomotion

Lingfan Bao; Tianhu Peng; Chengxu Zhou

TL;DR

The chapter addresses sim-to-real transfer for DRL-based bipedal locomotion. It diagnoses the gap between high-fidelity simulation and hardware and proposes a strategic framework built on three practical levers: model fidelity improvements (pre-training alignment and residual dynamics), policy hardening (domain randomization and teacher-student training), and online adaptation. It surveys end-to-end and hierarchical control schemes, identifies the main sources of mismatch in dynamics, contacts, sensing, and solvers, and emphasizes integrating the three levers rather than relying on any single method. Key contributions include formal offline system identification, residual dynamics learning, curriculum-guided domain randomization, and explicit and implicit online system identification to enable robust, scalable sim-to-real transfer. The practical impact is a structured, end-to-end roadmap for developing and evaluating robust sim-to-real solutions in legged robotics, with emphasis on verifiability, safety, and real-world adaptability.

Abstract

This chapter addresses the critical challenge of simulation-to-reality (sim-to-real) transfer for deep reinforcement learning (DRL) in bipedal locomotion. After contextualizing the problem within various control architectures, we dissect the "curse of simulation" by analyzing the primary sources of the sim-to-real gap: robot dynamics, contact modeling, state estimation, and numerical solvers. Building on this diagnosis, we structure the solutions around two complementary philosophies. The first is to shrink the gap through model-centric strategies that systematically improve the simulator's physical fidelity. The second is to harden the policy, using in-simulation robustness training and post-deployment adaptation to make the policy inherently resilient to model inaccuracies. The chapter concludes by synthesizing these philosophies into a strategic framework, providing a clear roadmap for developing and evaluating robust sim-to-real solutions.
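As a rough illustration of the "harden the policy" idea, the sketch below shows one common way domain randomization with a curriculum can be set up: simulator parameters are resampled each episode from a range whose width grows with a difficulty value. This is a minimal sketch and not the chapter's implementation. The parameter names, nominal values, and ranges (`link_mass_scale`, `ground_friction`, `motor_strength_scale`) are hypothetical, and the linear widening schedule is an assumption made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    nominal: float      # value used by the unperturbed simulator (assumed)
    max_rel_dev: float  # max relative deviation at full difficulty (assumed)


def sample_dynamics(ranges, difficulty, rng):
    """Sample one set of simulator parameters for a training episode.

    difficulty in [0, 1] scales the width of every range linearly:
    0 returns the nominal values, 1 uses the full randomization range.
    """
    difficulty = min(max(difficulty, 0.0), 1.0)
    sampled = {}
    for name, r in ranges.items():
        half_width = r.nominal * r.max_rel_dev * difficulty
        sampled[name] = rng.uniform(r.nominal - half_width,
                                    r.nominal + half_width)
    return sampled


# Hypothetical parameters and ranges, chosen only for illustration.
RANGES = {
    "link_mass_scale": ParamRange(nominal=1.0, max_rel_dev=0.2),
    "ground_friction": ParamRange(nominal=0.8, max_rel_dev=0.5),
    "motor_strength_scale": ParamRange(nominal=1.0, max_rel_dev=0.1),
}

rng = random.Random(0)
easy = sample_dynamics(RANGES, difficulty=0.0, rng=rng)  # nominal dynamics
hard = sample_dynamics(RANGES, difficulty=1.0, rng=rng)  # full randomization
```

In a training loop, `difficulty` would typically be raised as the policy's performance improves, so that early episodes stay close to the nominal model and later episodes cover the full range of mismatch the policy is expected to tolerate on hardware.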
