Learning to walk in confined spaces using 3D representation

Takahiro Miki; Joonho Lee; Lorenz Wellhausen; Marco Hutter

TL;DR

A method for legged locomotion control that uses reinforcement learning and 3D volumetric representations to enable robust, versatile locomotion in confined and unstructured environments, extending the applicability of legged robots to a broader range of scenarios.

Abstract

Legged robots have the potential to traverse complex terrain and access confined spaces beyond the reach of traditional platforms thanks to their ability to carefully select footholds and flexibly adapt their body posture while walking. However, robust deployment in real-world applications is still an open challenge. In this paper, we present a method for legged locomotion control using reinforcement learning and 3D volumetric representations to enable robust and versatile locomotion in confined and unstructured environments. By employing a two-layer hierarchical policy structure, we exploit the capabilities of a highly robust low-level policy to follow 6D commands and a high-level policy to enable three-dimensional spatial awareness for navigating under overhanging obstacles. Our study includes the development of a procedural terrain generator to create diverse training environments. We present a series of experimental evaluations in both simulation and real-world settings, demonstrating the effectiveness of our approach in controlling a quadruped robot in confined, rough terrain. By achieving this, our work extends the applicability of legged robots to a broader range of scenarios.
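The two-layer structure described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: both learned policies are replaced by hand-written stand-ins, and the field names, the height constants, and the functions `high_level_policy`, `low_level_policy`, and `control_step` are hypothetical names introduced only for illustration. The sketch shows only the data flow, in which a high-level module turns a 3D observation (reduced here to a single overhead clearance value) into a 6D command that a low-level module then tracks.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Command6D:
    """6D command passed from the high level to the low level.

    Field names are assumptions; the paper lists x, y, yaw rate,
    roll, pitch, and body height.
    """
    x: float
    y: float
    yaw_rate: float
    roll: float
    pitch: float
    body_height: float


# Hypothetical values chosen for illustration only (metres).
NOMINAL_HEIGHT = 0.55
MIN_HEIGHT = 0.30
CLEARANCE_MARGIN = 0.10


def high_level_policy(overhead_clearance: float, forward: float = 0.5) -> Command6D:
    """Hand-written stand-in for the learned high-level policy.

    Lowers the commanded body height when the overhead clearance shrinks,
    clamped between MIN_HEIGHT and NOMINAL_HEIGHT.
    """
    height = overhead_clearance - CLEARANCE_MARGIN
    height = min(NOMINAL_HEIGHT, max(MIN_HEIGHT, height))
    return Command6D(x=forward, y=0.0, yaw_rate=0.0,
                     roll=0.0, pitch=0.0, body_height=height)


def low_level_policy(command: Command6D, proprioception: Sequence[float]) -> List[float]:
    """Placeholder for the learned low-level policy.

    Returns 12 dummy joint targets (3 per leg on a quadruped) that depend
    only on the commanded height; a real policy would also use proprioception.
    """
    offset = command.body_height - NOMINAL_HEIGHT
    return [offset] * 12


def control_step(overhead_clearance: float, proprioception: Sequence[float]) -> List[float]:
    """One pass through the hierarchy: observation -> 6D command -> joint targets."""
    command = high_level_policy(overhead_clearance)
    return low_level_policy(command, proprioception)
```

Splitting the controller this way means the low level only has to follow commands robustly, while the spatial reasoning about overhanging obstacles stays in the high level.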

Paper Structure (15 sections, 3 equations, 8 figures)

INTRODUCTION
RELATED WORKS
METHODS
Overview
Low-level locomotion policy
High-level teacher policy
High-level student policy training
Procedural Terrain Generation
Experimental Results
Evaluation in simulation
Real-World Experiments
Low-level policy
Combined policy in the field
Simulation details
CONCLUSIONS

Figures (8)

Figure 1: Real-world experiment: Successful confined space traversal by the quadruped robot including a simulated collapsed building environment. The terrain consists of loose gravel or unstable steps, while the overhead structures have tilted configurations with narrow openings. The robot could adapt its posture to traverse these challenging conditions.
Figure 2: Overview of our method. We use a two-layer policy setup. The low-level policy learns to walk over rough terrain while following 6D commands consisting of x, y, yaw rate, roll, pitch, and body height. The high-level policy is trained in a procedurally generated confined environment to guide the robot by giving commands to the low-level policy. We first train a low-level teacher policy and then distill it into a low-level student policy. We then train a high-level teacher policy using spherical scans for exteroceptive perception. Finally, we distill this into a high-level student policy using noisy voxel grids as exteroceptive input.
Figure 3: Procedural terrain generation. We generate the terrain mesh by tiling mesh parts. First, the tiles are connected procedurally based on connectivity calculated from a terrain height array. Then, overhanging obstacles are added on top of the mesh.
Figure 4: Success rate for different overhanging obstacle heights and obstacle box heights from the ground. The $x$ axis shows the different methods and the $y$ axis shows the different obstacle parameters. For obstacle + overhanging, we used an obstacle height of $0.25$ m and varied the height of the overhanging box. The baseline methods, which always walk at normal height (High) or always walk crouching (Low), were compared against our method. The results show that the combination of overhanging obstacles and rough terrain requires adaptive body height control.
Figure 5: Sequence on evaluation terrains. We evaluated the policy's performance on evaluation terrains comprising three terrain types with different parameters. Overhanging has an overhanging box in the middle, Obstacle has a box on the ground, and Overhanging + obstacle combines both obstacle and overhanging boxes.
...and 3 more figures
