Broadcasting Support Relations Recursively from Local Dynamics for Object Retrieval in Clutters

Yitong Li; Ruihai Wu; Haoran Lu; Chuanruo Ning; Yan Shen; Guanqi Zhan; Hao Dong

TL;DR

This work addresses safe retrieval of a target object from clutter by modeling inter-object support as a graph built through recursive broadcasting of accurate local dynamics. The framework combines a Retrieval Direction Predictor and a Local Dynamics Predictor to construct a target-centered Support Graph $\mathcal{G}_s$, with a Clutter Solver and Graph Adjustment ensuring robust refinement under occlusions. A Manipulation Affordance Predictor estimates optimal grasp points and poses, enabling safe, stepwise removal of obstructing objects before accessing the target. Empirical results in both large-scale simulations and real-world setups demonstrate superior retrieval success, lower displacement, and fewer steps compared with baselines, highlighting the practical impact for robotic manipulation in complex cluttered environments.

Abstract

Cluttered objects are everywhere in daily life, from stationery and books scattered across a table to bowls and plates filling a kitchen sink. Retrieving a target object from clutter is an essential yet challenging skill for robots, because the target must be manipulated safely without disturbing other objects. This requires the robot to plan a manipulation sequence and first move away, step by step, the objects supported by the target. However, due to the diversity of object configurations (e.g., categories, geometries, locations and poses) and their combinations in clutter, it is difficult for a robot to accurately infer support relations between distant objects that are separated by various other objects. In this paper, we study object retrieval in complicated clutter via a novel method that recursively broadcasts accurate local dynamics to build a support relation graph of the whole scene, which greatly reduces the complexity of support relation inference and improves its accuracy. Experiments in both simulation and the real world demonstrate the efficiency and effectiveness of our method.
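The recursive broadcasting idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: `build_support_graph`, `removal_order`, and the callbacks `neighbors` and `is_supported_by` are hypothetical names, the learned Local Dynamics Predictor is replaced by a hand-written lookup, and the retrieval direction is ignored for simplicity. Starting from the target, the sketch queries only adjacent object pairs and expands outward until no new supported object is found.

```python
from collections import deque


def build_support_graph(target, neighbors, is_supported_by):
    """Broadcast support relations outward from the target.

    neighbors(obj) -> iterable of objects adjacent to obj (assumed given).
    is_supported_by(upper, lower) -> True if `upper` rests on `lower`;
    a hypothetical stand-in for a learned local dynamics prediction.
    Returns a dict mapping each object to the set of objects it supports.
    """
    graph = {target: set()}
    queue = deque([target])
    while queue:
        obj = queue.popleft()
        for other in neighbors(obj):
            # Only local (adjacent-pair) relations are ever queried.
            if is_supported_by(other, obj):
                graph[obj].add(other)
                if other not in graph:
                    graph[other] = set()
                    queue.append(other)
    return graph


def removal_order(graph, target):
    """Post-order traversal: supported objects come before their supporters."""
    order, seen = [], set()

    def visit(obj):
        if obj in seen:
            return
        seen.add(obj)
        for supported in sorted(graph[obj]):
            visit(supported)
        order.append(obj)

    visit(target)
    return order


# Toy scene (illustrative only): a pen lies on a book, the book lies on the
# target mug, and a box touches the mug without resting on it.
adjacency = {
    "mug": ["book", "box"],
    "book": ["mug", "pen"],
    "pen": ["book"],
    "box": ["mug"],
}
rests_on = {("book", "mug"), ("pen", "book")}  # (upper, lower) pairs

graph = build_support_graph(
    "mug",
    lambda obj: adjacency[obj],
    lambda upper, lower: (upper, lower) in rests_on,
)
print(removal_order(graph, "mug"))
```

In this toy scene the box never enters the graph because it is not supported by any object reachable from the target, so the resulting sequence moves the pen, then the book, and finally retrieves the mug.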

Paper Structure (19 sections, 5 equations, 7 figures, 6 tables)

Introduction
Related Work
Support Relations Inference
Cluttered Objects Manipulation
Dynamics Models for Robotic Manipulation
Problem Formulation
Method
Motivation and Overview
General Idea for Support Graph Generation
Retrieval Direction Predictor
Local Dynamics Predictor
Clutter Solver: Recursive Support Relation Broadcasting
Manipulation Affordance Predictor
Experiments
Setup
...and 4 more sections

Figures (7)

Figure 1: Our Proposed Framework broadcasts the support relations recursively from the target object using local dynamics between adjacent objects, and uses the support relation graph to efficiently guide the step-by-step target object retrieval.
Figure 2: Our Proposed Framework. The first row shows the Recursive Broadcasting process of support relations via local dynamics. To infer the local dynamics starting from an object $O_i$, our framework first uses the Direction Scoring Module to select the optimal retrieval direction from the candidates proposed by the Direction Proposal Module. Given the optimal retrieval direction, the Dynamics Predictor predicts the support relations between $O_i$ and each of its adjacent objects.
Figure 3: Graph Adjustment when the occluding pink box is removed and the system re-broadcasts the support relations from the mug to be retrieved (target object).
Figure 4: Affordance Scoring Module. To estimate the affordance score for a grasp point, we first calculate the preliminary affordance score solely based on the point itself, and then we evaluate the influence score by estimating the potential impact. The final affordance score is obtained by subtracting this influence score from the preliminary score.
Figure 5: Manipulation Sequence for Real-World Clutters with Captured Point Clouds. We show four cases covering the desk, food, sundries and kitchen scenarios, respectively. The second case involves occlusion removal and thus executes the Graph Adjustment process after the white box is moved away in column 3.
...and 2 more figures
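
The scoring rule stated in the Figure 4 caption (final affordance score = preliminary score minus influence score) can be sketched in a few lines of Python. The function name `final_affordance` and the numeric values are illustrative assumptions, not outputs of the paper's model; in the paper both scores come from learned modules.

```python
def final_affordance(preliminary, influence):
    """Subtract each point's influence score from its preliminary score."""
    if len(preliminary) != len(influence):
        raise ValueError("one influence score is needed per candidate point")
    return [p - i for p, i in zip(preliminary, influence)]


# Made-up scores for three candidate grasp points.
preliminary = [0.9, 0.7, 0.6]  # score based on the point itself
influence = [0.5, 0.1, 0.3]    # estimated potential impact of grasping there

scores = final_affordance(preliminary, influence)
best_point = max(range(len(scores)), key=scores.__getitem__)
print(best_point)
```

Here the point with the highest preliminary score is not selected, because its larger influence score lowers its final score below that of the second point.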
