RoboArm-NMP: a Learning Environment for Neural Motion Planning

Tom Jurgenson, Matan Sudry, Gal Avineri, Aviv Tamar

TL;DR

RoboArm-NMP introduces a unified learning and evaluation environment for neural motion planning (NMP) with a 7-DOF robot arm, integrating PyBullet simulation, demonstrations, and multiple obstacle-encoding schemes. The study systematically compares imitation learning (IL) and reinforcement learning (RL) approaches, assessing goal generalization, obstacle generalization, and inference speed, and finds that combining demonstrations with hindsight improves learning while obstacle generalization remains problematic. Among encoders, VQ-VAE provides the best obstacle encodings, yet even the best policies often underperform a simple go-to-goal baseline in cluttered and variable obstacle settings. The work highlights the need for more robust scene representations and generalization strategies, and positions RoboArm-NMP as a scalable platform for advancing NMP research.

Abstract

We present RoboArm-NMP, a learning and evaluation environment that allows simple and thorough evaluations of Neural Motion Planning (NMP) algorithms, focused on robotic manipulators. Our Python-based environment provides baseline implementations for learning control policies (either supervised or reinforcement learning based), a simulator based on PyBullet, data of solved instances using a classical motion planning solver, various representation learning methods for encoding the obstacles, and a clean interface between the learning and planning frameworks. Using RoboArm-NMP, we compare several prominent NMP design points, and demonstrate that the best methods mostly succeed in generalizing to unseen goals in a scene with fixed obstacles, but have difficulty in generalizing to unseen obstacle configurations, suggesting focus points for future research.

Paper Structure (22 sections, 10 figures, 8 tables)

Figures (10)

  • Figure 1: Examples of RoboArm-NMP tasks. From the left: double-walls, a narrow-gap scenario with fixed obstacles; two samples of random boxes hard, demonstrating challenging narrow passages in our training data; and two test (OOD) tasks, narrow shelves and benchmaker bookshelf tall (ported from chamzas2021motionbenchmaker). See Section \ref{sec:benchmark-description} for the full task descriptions.
  • Figure 2: Schematic diagram of the RoboArm-NMP components.
  • Figure 3: We compare the success rates of different algorithms (x-axis) with different goal representations (colors). (a) No obstacles task. (b) Double walls wide gap task. (c) Double walls task.
  • Figure 4: Random boxes easy example queries. The start configuration, goal state and obstacles (2-4) are sampled randomly. The robot is at the starting state and the green sphere represents the end-effector goal position.
  • Figure 5: Random boxes medium example queries. The start configuration, goal state and obstacles (3-6) are sampled randomly. The robot is at the starting state and the green sphere represents the end-effector goal position.
  • ...and 5 more figures