Neural Rearrangement Planning for Object Retrieval from Confined Spaces Perceivable by Robot's In-hand RGB-D Sensor

Hanwen Ren; Ahmed H. Qureshi

TL;DR

The paper tackles retrieving a target object $o^*$ from unknown confined spaces using an in-hand RGB-D perceptual pipeline and a neural rearrangement planning framework. It introduces two neural modules, Object Selection Network (OSNet) to identify blockers and Region Proposal Network (RPNet) to propose relocation regions that preserve path homotopy to the target, enabling fast, learned planning. The approach integrates active sensing, GraspNet-driven grasp poses, and reachability checks (RRT-Connect) with an iterative rearrangement loop until the target becomes reachable. Empirically, the neural planner substantially outperforms baselines in success rate and planning speed (notably orders of magnitude faster) and demonstrates sim-to-real transfer in cabinet-like real-world scenes.

Abstract

Rearrangement planning for object retrieval tasks from confined spaces is a challenging problem, primarily due to the lack of open space for robot motion and limited perception. Several traditional methods exist to solve object retrieval tasks, but they require overhead cameras for perception and a time-consuming exhaustive search to find a solution, and they often make unrealistic assumptions, such as all objects in the environment having identical, simple geometry. This paper presents a neural object retrieval framework that efficiently performs rearrangement planning of unknown, arbitrary objects in confined spaces to retrieve the desired object using a given robot grasp. Our method actively senses the environment with the robot's in-hand camera. It then selects and relocates the non-target objects such that they do not block the robot path homotopy to the target object, thus also aiding an underlying path planner in quickly finding robot motion sequences. Furthermore, we demonstrate our framework in challenging scenarios, including real-world cabinet-like environments with arbitrary household objects. The results show that our framework achieves the best performance among all presented methods and is, on average, two orders of magnitude computationally faster than the best-performing baselines.
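
The select-relocate-recheck loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the shelf is reduced to a grid of (lane, depth) cells, `select_object` is a hand-written stand-in for OSNet, `propose_region` is a stand-in for RPNet, and `is_reachable` replaces the RRT-Connect reachability check with a simple "nothing in front of the target" test. Active sensing and grasp generation are omitted.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]   # (lane, depth); depth 0 is the cabinet opening (assumed layout)
Scene = Dict[str, Cell]  # object name -> occupied cell


def blockers(scene: Scene, target: str) -> List[str]:
    """Objects in the target's lane that sit between the opening and the target."""
    lane, depth = scene[target]
    return [n for n, (l, d) in scene.items()
            if n != target and l == lane and d < depth]


def is_reachable(scene: Scene, target: str) -> bool:
    """Toy stand-in for the reachability check (hypothetical, not RRT-Connect)."""
    return not blockers(scene, target)


def select_object(scene: Scene, target: str) -> Optional[str]:
    """Toy stand-in for OSNet: pick the blocker closest to the opening."""
    candidates = blockers(scene, target)
    return min(candidates, key=lambda n: scene[n][1]) if candidates else None


def propose_region(scene: Scene, target: str,
                   lanes: int, depths: int) -> Optional[Cell]:
    """Toy stand-in for RPNet: deepest free cell outside the target's lane."""
    target_lane = scene[target][0]
    occupied = set(scene.values())
    for lane in range(lanes):
        if lane == target_lane:
            continue  # never place an object back into the target's pathway
        for depth in range(depths - 1, -1, -1):
            if (lane, depth) not in occupied:
                return (lane, depth)
    return None


def retrieve(scene: Scene, target: str, lanes: int = 3, depths: int = 3,
             max_steps: int = 10) -> Tuple[bool, List[Tuple[str, Cell]]]:
    """Relocate blockers one at a time until the target is reachable."""
    scene = dict(scene)  # work on a copy so the caller's scene is untouched
    plan: List[Tuple[str, Cell]] = []
    for _ in range(max_steps):
        if is_reachable(scene, target):
            return True, plan
        obj = select_object(scene, target)
        region = propose_region(scene, target, lanes, depths)
        if obj is None or region is None:
            return False, plan  # nothing to move, or nowhere to put it
        scene[obj] = region
        plan.append((obj, region))
    return is_reachable(scene, target), plan


if __name__ == "__main__":
    shelf = {"plum": (1, 2), "can_a": (1, 0), "can_b": (1, 1), "box": (0, 2)}
    print(retrieve(shelf, "plum"))
```

In this toy scene the two cans in front of the plum are moved into a neighbouring lane before the plum counts as reachable. The real framework replaces each stand-in with a learned network or a motion planner operating on sensed geometry.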

Paper Structure (18 sections, 4 equations, 3 figures, 1 table)

Introduction
Related Work
Proposed Method
Problem Definition
Scene observation
Neural Object Selection
Neural Rearrangement Region Proposal
Full Pipeline Algorithm
Results & Discussions
Baselines
Success rate
Planning time
Number of objects rearranged & moving distance
Ablation Studies
OSNet-only Planner
...and 3 more sections

Figures (3)

Figure 1: Execution for retrieving the target object ("plum"): The robot's pathway to the target object is blocked in the initial setup. After executing our method's manipulation plan, which relocates the pathway-blocking objects (here, the cylinders), the robot arm retrieves the plum from the confined cabinet environment.
Figure 2: Neural Object Retrieval: The task is to retrieve the yellow object (banana). Our main modules are the object selection network and the region proposal network. For brevity, we do not show the MLPs that encode the given inputs. Given the scene observation via active sensing, the object selection network selects the non-target object for rearrangement. The chosen object $o'$ is indicated in red in the bottom scene image. The region proposal network proposes the best placement region for the selected object to clear the pathway for target object retrieval. The proposed placement region on the environment surface is marked red on the top right scene image. Our robot moves the selected object to its new placement and retrieves the target object if possible; otherwise, it repeats the object rearrangement process in the confined environment.
Figure 3: Execution for retrieving the yellow target object ("banana"): In the initial setup, the target object is not retrievable as other objects block it. The robot clears the pathway by moving two cylindrical objects (frames 1-4) and then takes the pathway that passes behind all the objects to reach the target. It can also be seen that confined spaces impose significant challenges on robot motion, especially when the object to retrieve, such as a banana, is lower in height than the other objects.
