SLR: Learning Quadruped Locomotion without Privileged Information

Shiyi Chen; Zeyu Wan; Shiyang Yan; Chun Zhang; Weiyi Zhang; Qiang Li; Debing Zhang; Fasih Ud Din Farrukh

TL;DR

This work proposes a Self-learning Latent Representation (SLR) method that learns high-performance control policies without privileged information and surpasses previous methods while using only limited proprioceptive data.

Abstract

Mainstream reinforcement learning controllers for quadruped robots often rely on privileged information, which must be carefully selected and precisely estimated and therefore constrains the development process. This work proposes a Self-learning Latent Representation (SLR) method that learns high-performance control policies without privileged information. To make the evaluation more credible, SLR was compared directly with state-of-the-art algorithms using their open-source code repositories and original configuration parameters. SLR surpasses these previous methods while using only limited proprioceptive data, demonstrating significant potential for future applications. The trained policy and encoder enable the quadruped robot to traverse various challenging terrains. Videos of our results can be found on our website: https://11chens.github.io/SLR/

Paper Structure (20 sections, 5 equations, 8 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Method
Problem Formulation
Framework Overview
Experiments
Ablation Study for Latent Representations
Latent Representation Analysis
Results
Compared Methods
Simulation
Deploy in Real-World
Discussion and Limitations
Appendix
Self-learning Latent Representation Details
...and 5 more sections

Figures (8)

Figure 1: We propose a framework for training a robust quadruped locomotion policy without relying on privileged information. The robot effectively navigates challenging terrains, showcasing adaptive locomotion skills acquired through self-learning.
Figure 2: Illustration of SLR training framework. All dashed lines represent the network updating process. The translucent fuchsia lines indicate the encoder updates through backpropagation from the Critic network, the Transition model, and random sampling. The remaining solid lines represent the network's forward inference process.
Figure 3: Ablation study training curves, averaged over 3 seeds. The shaded area represents the standard deviation across seeds, and the curves are smoothed using Gaussian filtering.
Figure 4: t-SNE test terrains.
Figure 5: t-SNE visualization of Implicit (left) and SLR (right). Color intensity represents cumulative steps across four terrains. The privileged latent distribution is discrete and weakly correlated with terrain. In contrast, the SLR latent trajectories align precisely with the terrains traversed by the robot, with each ring-like representation accompanied by a "tail", indicating terrain transitions.
...and 3 more figures