Neural Randomized Planning for Whole Body Robot Motion

Yunfan Lu; Yuchen Ma; David Hsu; Panpan Cai

TL;DR

The paper tackles real-time, long-range whole-body motion planning for high-DOF robots in cluttered homes. It introduces Neural Randomized Planner (NRP), which combines a global sampling-based motion planning (SBMP) backbone (RRT- or IRRT*-based) with a learned local neural sampler. The sampler generates optimal waypoints conditioned on the local surroundings, and the global planner stitches these local samples into a global plan. The local samplers come in discriminative (classifying optimal waypoints among random samples) and generative (CVAE-based sampling) variants and are trained offline on ground truth generated with PRM*; the global planner preserves probabilistic completeness and asymptotic optimality. In Gibson-simulated and real Fetch robot experiments, NRP outperforms classical and learning-enhanced SBMP methods in success rate and path quality, and transfers zero-shot to novel environments without fine-tuning. The work shows that locally conditioned learning integrated with planning yields data-efficient, scalable, and robust whole-body motion in realistic environments; future directions include fast re-planning and more expressive generative models.

Abstract

Robot motion planning has made vast advances over the past decades, but the challenge remains: robot mobile manipulators struggle to plan long-range whole-body motion in common household environments in real time, because of high-dimensional robot configuration space and complex environment geometry. To tackle the challenge, this paper proposes Neural Randomized Planner (NRP), which combines a global sampling-based motion planning (SBMP) algorithm and a local neural sampler. Intuitively, NRP uses the search structure inside the global planner to stitch together learned local sampling distributions to form a global sampling distribution adaptively. It benefits from both learning and planning. Locally, it tackles high dimensionality by learning to sample in promising regions from data, with a rich neural network representation. Globally, it composes the local sampling distributions through planning and exploits local geometric similarity to scale up to complex environments. Experiments both in simulation and on a real robot show that NRP yields superior performance compared to some of the best classical and learning-enhanced SBMP algorithms. Further, despite being trained in simulation, NRP demonstrates zero-shot transfer to a real robot operating in novel household environments, without any fine-tuning or manual adaptation.
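
The way a global tree search can stitch local samples into a plan can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain 2D RRT-style loop in which the local sampler is a pluggable function. The names `plan`, `straight_line_sampler`, and `collision_free`, the circular obstacles, the 10 x 10 workspace, and all parameter values are assumptions chosen for illustration. A trained neural sampler would take the place of `straight_line_sampler`, and the sketch only adds the edge to the waypoint rather than the full expansion path toward the target.

```python
import math
import random

def collision_free(p, q, obstacles, step=0.05):
    """Toy collision check: sample points along segment p-q against circular obstacles."""
    n = max(1, int(math.dist(p, q) / step))
    for i in range(n + 1):
        t = i / n
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True

def straight_line_sampler(q_cur, q_target, obstacles, rng, radius=0.5):
    """Hypothetical placeholder for a learned local sampler:
    move from q_cur toward q_target, at most `radius` away."""
    d = math.dist(q_cur, q_target)
    if d <= radius:
        return q_target
    s = radius / d
    return (q_cur[0] + s * (q_target[0] - q_cur[0]),
            q_cur[1] + s * (q_target[1] - q_cur[1]))

def plan(q_start, q_goal, obstacles, local_sampler, rng,
         max_iters=5000, goal_bias=0.1, goal_tol=0.1):
    """RRT-style loop: pick a target, pick the nearest tree vertex,
    ask the local sampler for a waypoint, and add it if the edge is free."""
    parent = {q_start: None}  # tree stored as child -> parent
    for _ in range(max_iters):
        if rng.random() < goal_bias:
            q_target = q_goal
        else:
            q_target = (rng.uniform(0, 10), rng.uniform(0, 10))
        q_cur = min(parent, key=lambda v: math.dist(v, q_target))
        q_way = local_sampler(q_cur, q_target, obstacles, rng)
        if q_way in parent or not collision_free(q_cur, q_way, obstacles):
            continue
        parent[q_way] = q_cur
        if math.dist(q_way, q_goal) <= goal_tol and collision_free(q_way, q_goal, obstacles):
            if q_way != q_goal:
                parent[q_goal] = q_way
            path, v = [], q_goal
            while v is not None:  # walk back to the root
                path.append(v)
                v = parent[v]
            return path[::-1]
    return None  # no solution within the iteration budget

obstacles = [(5.0, 5.0, 1.5)]  # (centre x, centre y, radius)
path = plan((1.0, 1.0), (9.0, 9.0), obstacles, straight_line_sampler, random.Random(0))
```

Keeping the sampler behind a function argument mirrors the division of labour described above: the global loop owns the tree and the collision checks, while the local component only proposes where to expand next.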

Paper Structure (30 sections, 4 equations, 19 figures, 4 tables)

Introduction
Related work
Generating Whole-Body Robot Motions
Learning Sampling Distributions
Neural Randomized Planning
Global Motion Planner
Local Neural Sampler
Discriminative Sampler
Generative Sampler
Training Dataset
Simulation Experiments
Experimental Setup in Simulation
Evaluation Tasks
Variants and Comparison Baselines
Implementation Details
...and 15 more sections

Figures (19)

Figure 1: A Fetch robot executing the whole-body motion planned by NRP, with its end-effector trajectory shown in yellow. The task is to move a teddy bear on the ground to the kitchen shelf, requiring careful base and arm coordination for efficient obstacle avoidance and task accomplishment.
Figure 2: NRP Online Planning Process. (a) Initial planning tree with start configuration $q_{s}$ and goal $q_{g}$. (b) An expansion target $q_{target}$ is sampled and the nearest vertex in the tree $q_{cur}$ is selected for expansion. (c) The neural sampler uses $q_{cur}$, $q_{target}$, and the local environment around $q_{cur}$ to generate an optimal waypoint $q^{*}$ for expansion. (d) The expansion path passing through $q^{*}$, upon passing collision checks, is added to the tree. This sequence repeats until a solution is discovered or a maximum planning time is reached.
Figure 3: NRP's Offline Learning Process. Training data is collected by sampling numerous planning queries, i.e., start and goal configurations, within training environments. For each query, the ground truth optimal waypoint, denoted as $q^{*}$, is extracted from the roadmap pre-computed by an expert motion planner. This waypoint serves as supervision for both local neural samplers. Specifically, the generative sampler learns to reconstruct the optimal waypoint from latent-space samples, while the discriminative sampler learns to classify optimal waypoints from random samples.
Figure 4: Overview of discriminative sampler. $Q$ denotes a set of uniformly sampled configurations within $W^{local}$.
Figure 5: Overview of generative sampler. $q^{*}$ denotes the optimal waypoint. We only use the decoder at inference time.
...and 14 more figures
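
The selection step suggested by the Figure 4 caption (draw a set $Q$ of uniform candidates in the local region, then pick the one the classifier rates highest) can be sketched as follows. This is an illustrative assumption, not the paper's network: `toy_score` is a hypothetical stand-in for a trained classifier and ignores the local environment, the local region is taken to be an axis-aligned box around $q_{cur}$, and `half_width` and `n` are arbitrary values.

```python
import math
import random

def sample_candidates(q_cur, half_width, n, rng):
    """Draw n configurations uniformly from an axis-aligned box centred on q_cur."""
    return [tuple(c + rng.uniform(-half_width, half_width) for c in q_cur)
            for _ in range(n)]

def toy_score(q, q_target):
    """Hypothetical stand-in for a trained classifier:
    candidates closer to the target receive a higher score."""
    return -math.dist(q, q_target)

def select_waypoint(q_cur, q_target, score_fn, rng, half_width=1.0, n=64):
    """Score every uniformly sampled candidate and return the best one."""
    candidates = sample_candidates(q_cur, half_width, n, rng)
    return max(candidates, key=lambda q: score_fn(q, q_target))

q_cur = (0.0, 0.0, 0.0)
q_target = (5.0, 0.0, 0.0)
waypoint = select_waypoint(q_cur, q_target, toy_score, random.Random(1))
```

In a real system the score function would also receive an encoding of the local environment, as the Figure 2 caption indicates for the neural sampler.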
