
Neural Randomized Planning for Whole Body Robot Motion

Yunfan Lu, Yuchen Ma, David Hsu, Panpan Cai

TL;DR

The paper tackles real-time, long-range whole-body motion planning for high-DOF robots in cluttered homes. It introduces Neural Randomized Planner (NRP), which combines a global SBMP backbone (RRT/IRRT*-based) with a learned local neural sampler that generates optimal waypoints conditioned on the local surroundings; the global planner then stitches these local samples into a global plan. The local sampler comes in two variants, discriminative (classifying optimal waypoints among random samples) and generative (CVAE-based sampling), both trained offline on ground truth generated with PRM*. The global planner preserves probabilistic completeness and asymptotic optimality. In Gibson-simulated and real Fetch robot experiments, NRP outperforms classical and learning-enhanced SBMP methods in success rate and path quality, and transfers zero-shot to novel environments without fine-tuning. The work shows that learning conditioned on local geometry, integrated with planning, yields data-efficient, scalable, and robust whole-body motion in realistic environments. Future directions include fast re-planning and more expressive generative models.

Abstract

Robot motion planning has made vast advances over the past decades, but the challenge remains: robot mobile manipulators struggle to plan long-range whole-body motion in common household environments in real time, because of high-dimensional robot configuration space and complex environment geometry. To tackle the challenge, this paper proposes Neural Randomized Planner (NRP), which combines a global sampling-based motion planning (SBMP) algorithm and a local neural sampler. Intuitively, NRP uses the search structure inside the global planner to stitch together learned local sampling distributions to form a global sampling distribution adaptively. It benefits from both learning and planning. Locally, it tackles high dimensionality by learning to sample in promising regions from data, with a rich neural network representation. Globally, it composes the local sampling distributions through planning and exploits local geometric similarity to scale up to complex environments. Experiments both in simulation and on a real robot show that NRP yields superior performance compared to some of the best classical and learning-enhanced SBMP algorithms. Further, despite being trained in simulation, NRP demonstrates zero-shot transfer to a real robot operating in novel household environments, without any fine-tuning or manual adaptation.

Paper Structure (30 sections, 4 equations, 19 figures, 4 tables)

Figures (19)

  • Figure 1: A Fetch robot executing the whole-body motion planned by NRP, with its end-effector trajectory shown in yellow. The task is to move a teddy bear on the ground to the kitchen shelf, requiring careful base and arm coordination for efficient obstacle avoidance and task accomplishment.
  • Figure 2: NRP Online Planning Process. (a) Initial planning tree with start configuration $q_{s}$ and goal $q_{g}$. (b) An expansion target $q_{target}$ is sampled and the nearest vertex in the tree $q_{cur}$ is selected for expansion. (c) The neural sampler uses $q_{cur}$, $q_{target}$, and the local environment around $q_{cur}$ to generate an optimal waypoint $q^{*}$ for expansion. (d) The expansion path passing through $q^{*}$, upon passing collision checks, is added to the tree. This sequence repeats until a solution is discovered or a maximum planning time is reached.
  • Figure 3: NRP's Offline Learning Process. Training data is collected by sampling numerous planning queries, i.e., start and goal configurations, within training environments. For each query, the ground truth optimal waypoint, denoted as $q^{*}$, is extracted from the roadmap pre-computed by an expert motion planner. This waypoint serves as supervision for both local neural samplers. Specifically, the generative sampler learns to reconstruct the optimal waypoint from latent-space samples, while the discriminative sampler learns to classify optimal waypoints from random samples.
  • Figure 4: Overview of discriminative sampler. $Q$ denotes a set of uniformly sampled configurations within $W^{local}$.
  • Figure 5: Overview of generative sampler. $q^{*}$ denotes the optimal waypoint. We only use the decoder at inference time.
  • ...and 14 more figures
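
The online planning loop described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2D point robot with circular obstacles, and the neural sampler is replaced by a hypothetical hand-written stand-in (`stand_in_sampler`) that draws candidates uniformly in a local window around $q_{cur}$ and returns the reachable one closest to $q_{target}$. All function names, parameters, and default values here are illustrative assumptions.

```python
import math
import random


def collision_free(a, b, obstacles, step=0.05):
    """Check the straight segment a->b against circular obstacles (ox, oy, r)."""
    n = max(1, int(math.dist(a, b) / step))
    for i in range(n + 1):
        t = i / n
        p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if any(math.dist(p, (ox, oy)) <= r for ox, oy, r in obstacles):
            return False
    return True


def stand_in_sampler(q_cur, q_target, obstacles, rng,
                     local_radius=1.0, n_candidates=16):
    """Hypothetical stand-in for the learned local sampler (assumption).

    Draws candidates uniformly in a square window around q_cur, discards
    those not reachable from q_cur, and returns the one closest to q_target.
    Returns None if every candidate is blocked.
    """
    best, best_cost = None, math.inf
    for _ in range(n_candidates):
        q = (q_cur[0] + rng.uniform(-local_radius, local_radius),
             q_cur[1] + rng.uniform(-local_radius, local_radius))
        if not collision_free(q_cur, q, obstacles):
            continue
        cost = math.dist(q, q_target)
        if cost < best_cost:
            best, best_cost = q, cost
    return best


def plan(q_s, q_g, obstacles, rng, bounds=(0.0, 10.0),
         max_iters=5000, goal_bias=0.1, local_radius=1.0):
    """RRT-style loop following steps (a)-(d) of the Figure 2 caption."""
    if q_s == q_g:
        return [q_s]
    parent = {q_s: None}  # (a) tree initially holds only the start
    for _ in range(max_iters):
        # (b) sample an expansion target and pick the nearest tree vertex
        if rng.random() < goal_bias:
            q_target = q_g
        else:
            q_target = (rng.uniform(*bounds), rng.uniform(*bounds))
        q_cur = min(parent, key=lambda q: math.dist(q, q_target))
        # (c) ask the local sampler for a waypoint q*
        q_star = stand_in_sampler(q_cur, q_target, obstacles, rng, local_radius)
        if q_star is None or q_star in parent:
            continue
        # (d) a sampler carries no guarantee, so the planner checks the edge
        if not collision_free(q_cur, q_star, obstacles):
            continue
        parent[q_star] = q_cur
        # stop once the goal can be connected directly
        if (math.dist(q_star, q_g) <= local_radius
                and collision_free(q_star, q_g, obstacles)):
            parent[q_g] = q_star
            path, q = [], q_g
            while q is not None:
                path.append(q)
                q = parent[q]
            return path[::-1]
    return None  # iteration budget exhausted


if __name__ == "__main__":
    rng = random.Random(0)
    obstacles = [(5.0, 5.0, 1.0)]
    path = plan((1.0, 1.0), (9.0, 9.0), obstacles, rng)
    print("waypoints:", None if path is None else len(path))
```

The sampler is passed only $q_{cur}$, $q_{target}$, and the obstacles, mirroring the inputs named in the caption, so a learned model could in principle be swapped in behind the same interface while the tree search stays unchanged.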