Learning Quadrupedal Robot Locomotion for Narrow Pipe Inspection

Jing Guo, Ziwei Wang, Weibang Bai

TL;DR

This work addresses the challenge of navigating quadrupedal robots through narrow pipelines for inspection tasks. It introduces a reinforcement learning framework that leverages privileged information via bidirectional height scanning and a compact, multi-component reward to promote stable, efficient pipe traversal. A three-stage curriculum within a simulated environment, coupled with domain randomization, enables training of a policy that generalizes to varied pipe diameters and obstacle layouts, with real-world demonstrations showing partial transfer. The findings highlight the potential of RL-based locomotion in confined spaces while acknowledging sim-to-real gaps and pointing toward LiDAR-based sensing for improved robustness in practical deployments.

Abstract

Pipes are used extensively in both industrial settings and daily life, but inspecting them, especially narrow ones, remains very challenging and consumes considerable time and manufacturing effort. Quadrupedal robots, inspired by patrol dogs, can substitute for traditional solutions but suffer from navigation and locomotion difficulties. In this paper, we introduce a Reinforcement Learning (RL) based method to train a policy that enables quadrupedal robots to cross narrow pipes adaptively. We define a new form of privileged visual information and a new reward function to tackle these problems. Experiments in both simulation and real-world scenarios demonstrate that the proposed method can accomplish the pipe-crossing task even with unexpected obstacles inside the pipe.

Paper Structure

This paper contains 21 sections, 7 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Overall framework of our method. In the first stage, we introduce bidirectional scandots and proprioception as the privileged information. In the second stage, we distill the result into a policy that takes depth-camera input.
  • Figure 2: Training terrain, which consists of multiple pipe channels arranged in 10 rows and 40 columns. The bottom half of each pipe channel is exposed to better visualize and observe the poses and states of the quadrupedal robots inside the pipes.
  • Figure 3: Design of the bidirectional scandots, which capture both the ceiling (top half) and the floor (bottom half) terrain information of the pipe.
  • Figure 4: Centerline rewards. We sample points on the robot base and compute the reward from the distances between those points and the pipe centerline.
  • Figure 5: Screenshots of simulation tests of the quadrupedal robot's pipe-crossing task, using policies trained with the RL approach, in 4 different cases. The upper and lower rows show pipe crossing without (w/o) and with (w) random obstacles, respectively. The left groups use pipes with a radius of 0.3 m, and the right groups use pipes with a radius of 0.2 m. In each group, from a) to c), the quadrupedal robot steps into the pipe entrance, traverses the pipe with the trained RL policy, and exits the pipe.
  • ...and 1 more figures
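
The captions for Figures 3 and 4 describe two geometric ideas: scanning both the floor and the ceiling of the pipe, and rewarding base points that stay close to the pipe centerline. The following Python sketch illustrates both under stated assumptions; it is not the paper's implementation. It assumes an ideal circular pipe with its axis along x, and the function names, the Gaussian reward kernel, and the `sigma` parameter are hypothetical choices made for illustration only.

```python
import math


def bidirectional_scan(radius, lateral_offsets, base_height):
    """Clearance below and above the base at each lateral scan offset.

    Assumption: an ideal circular pipe whose axis runs along x, with the
    cross-section centred at y = 0, z = 0. `base_height` is the z coordinate
    of the robot base relative to the pipe axis (negative = below the axis).
    """
    floor, ceiling = [], []
    for y in lateral_offsets:
        # Half of the vertical chord of the circle at lateral offset y,
        # clamped to 0 for scan points at or beyond the pipe wall.
        half_chord = math.sqrt(max(radius ** 2 - y ** 2, 0.0))
        floor.append(max(base_height + half_chord, 0.0))
        ceiling.append(max(half_chord - base_height, 0.0))
    return floor, ceiling


def centerline_reward(base_points, sigma=0.1):
    """Reward of at most 1 that peaks when the base points lie on the axis.

    `base_points` are (x, y, z) samples on the robot base, expressed in the
    pipe frame. The Gaussian kernel and `sigma` are illustrative assumptions,
    not values taken from the paper.
    """
    # Perpendicular distance to the x-aligned centerline ignores x.
    dists = [math.hypot(y, z) for _, y, z in base_points]
    mean_dist = sum(dists) / len(dists)
    return math.exp(-(mean_dist / sigma) ** 2)
```

For example, in a pipe of radius 0.3 m with the base 0.1 m below the axis, the scan at the lateral center reports roughly 0.2 m of floor clearance and 0.4 m of ceiling clearance. The reward decreases smoothly as the sampled base points drift away from the centerline.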