PushingBots: Collaborative Pushing via Neural Accelerated Combinatorial Hybrid Optimization

Zili Tang, Ying Zhang, Meng Guo

TL;DR

This work tackles collaborative pushing of multiple arbitrary objects by a fleet of robots in cluttered environments. It formulates the problem as a combinatorial-hybrid optimization (CHO) and introduces a three-layer framework: MAPF-based task decomposition with dynamic subteam assignment, a keyframe-guided hybrid search (KGHS) for efficient mode sequencing, and online diffusion-accelerated execution with adaptation. A diffusion-based predictor, trained offline, proposes promising keyframes and pushing modes; each proposal is still verified for feasibility, which preserves completeness guarantees under mild assumptions. Extensive simulations and hardware experiments demonstrate scalability to many objects and heterogeneous robots, generalization to 6D pushing, and robustness to perturbations, outperforming multiple baselines in both single- and multi-object scenarios. The approach offers a practical, theoretically grounded solution for large-scale cooperative pushing, with potential extensions to distributed control and dynamic obstacles.

Abstract

Many robots are not equipped with a manipulator, and many objects (such as large boxes and cylinders) are not suitable for prehensile manipulation. In these cases, pushing is a simple yet effective non-prehensile skill that lets robots interact with and change the environment. Existing work often assumes a set of predefined pushing modes and fixed-shape objects. This work tackles the general problem of controlling a robotic fleet to collaboratively push numerous arbitrary objects to their respective destinations within complex environments containing cluttered and movable obstacles. The problem combines challenges characteristic of multi-robot systems, such as online task coordination under large uncertainties in cost and duration, with challenges characteristic of contact-rich tasks, such as hybrid switching among different contact modes and under-actuation due to constrained contact forces. The proposed method is based on combinatorial hybrid optimization over dynamic task assignments and hybrid execution via sequences of pushing modes and associated forces. It consists of three main components: (I) the decomposition, ordering and rolling assignment of pushing subtasks to robot subgroups; (II) the keyframe-guided hybrid search to optimize the sequence of parameterized pushing modes for each subtask; (III) the hybrid control to execute these modes and transition among them. In addition, a diffusion-based accelerator predicts the keyframes and pushing modes that should be prioritized during the hybrid search, which further improves planning efficiency. The framework is complete under mild assumptions. Its efficiency and effectiveness are validated extensively in simulations and hardware experiments with different numbers of robots and general-shaped objects, along with generalizations to heterogeneous robots, planar assembly and 6D pushing.
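
The abstract states that the learned accelerator only prioritizes keyframes and pushing modes, while the search itself remains complete. The toy sketch below illustrates that general "prioritize, then verify" pattern and is not the paper's KGHS algorithm: the function `first_verified`, the mode names, and the feasibility check are all hypothetical placeholders. Predicted candidates are tried first, every candidate is still verified, and unpredicted candidates remain as a fallback.

```python
from typing import Callable, Hashable, Iterable, Optional, Sequence


def first_verified(
    candidates: Sequence[Hashable],
    predicted: Iterable[Hashable],
    is_feasible: Callable[[Hashable], bool],
) -> Optional[Hashable]:
    """Try predicted candidates first, then all others; verify every one."""
    allowed = set(candidates)
    ordered = []
    seen = set()
    # Predictions only change the order; no candidate is ever dropped.
    for c in list(predicted) + list(candidates):
        if c in allowed and c not in seen:
            seen.add(c)
            ordered.append(c)
    for c in ordered:
        if is_feasible(c):  # hypothetical verification step
            return c
    return None  # no candidate passed verification


# Hypothetical pushing modes and a stand-in feasibility check.
modes = ["left", "right", "top", "bottom"]
feasible = lambda m: m in {"right", "bottom"}

good_guess = first_verified(modes, ["bottom"], feasible)
bad_guess = first_verified(modes, ["top"], feasible)
no_guess = first_verified(modes, [], feasible)
```

A good prediction changes which feasible candidate is reached first, and a poor or empty prediction falls back to the plain enumeration order, so a feasible candidate is found whenever one exists. This is the sense in which a predictor can speed up a search without affecting its completeness.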

Paper Structure

This paper contains 42 sections, 4 theorems, 22 equations, 24 figures, 6 tables, 2 algorithms.

Key Result

Lemma 1

Alg. \ref{alg:segments} terminates in finite steps and generates a strict partial ordering. $\blacksquare$
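
To make the notion of a strict partial ordering concrete, the following minimal sketch checks the property for a small precedence relation and derives one execution order consistent with it. It uses the example ordering from the Figure 3 caption, interpreting each relation as "must finish before", which is an assumption of this sketch. The string labels (`S1_1` for $\mathfrak{S}^1_1$, and so on) and the helper functions are hypothetical and are not taken from the paper's algorithm.

```python
from graphlib import TopologicalSorter
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of a set of (before, after) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_strict_partial_order(pairs):
    """True if the closure of `pairs` is irreflexive (hence also asymmetric)."""
    return all(a != b for a, b in transitive_closure(pairs))


def execution_order(subtasks, pairs):
    """Return one linear order of `subtasks` consistent with `pairs`."""
    sorter = TopologicalSorter({s: set() for s in subtasks})
    for before, after in pairs:
        sorter.add(after, before)  # `after` must wait for `before`
    return list(sorter.static_order())  # raises graphlib.CycleError on a cycle


# Example relation from the Figure 3 caption (labels are illustrative).
subtasks = ["S1_1", "S2_1", "S1_2"]
precedence = {("S1_2", "S2_1"), ("S1_1", "S2_1")}

ok = is_strict_partial_order(precedence)
order = execution_order(subtasks, precedence)
```

Because the relation is only partial, `S1_1` and `S1_2` are unordered with respect to each other; either may come first, and a scheduler is free to run them in parallel.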

Figures (24)

  • Figure 1: Snapshots of the PushingBots system, during the simulated planar assembly task via $12$ robots and $14$ objects (Left); and the hardware experiments of swapping $2$ objects via $4$ mini-ground vehicles (Right).
  • Figure 2: Overall proposed online planning and adaptation framework, consisting of the MAPF-based task decomposition, subtask generation with partial ordering, the online rolling assignment of subtasks, the neural accelerated hybrid optimization, and the online hybrid control.
  • Figure 3: Illustration of the decomposition and ordering of pushing tasks via Alg. \ref{alg:segments}, yielding $3$ subtasks $\{\mathfrak{S}^1_1,\mathfrak{S}^2_1,\mathfrak{S}^1_2\}$ with ordering $\mathfrak{S}^1_2\preceq \mathfrak{S}^2_1$ and $\mathfrak{S}^1_1 \preceq \mathfrak{S}^2_1$. Note that object $m_1$ waits for object $m_2$ to pass the corridor.
  • Figure 4: Selection and expansion during the proposed receding-horizon assignment of the 12 subtasks and 7 robots, given the current planning time (red dashed line) and the horizon $H=5$ (blue dashed line).
  • Figure 5: Illustration of the keyframe-guided hybrid search algorithm (Top), which is accelerated by the diffusion-based predictor for keyframes and pushing modes (Bottom). Note that the multi-modal predictions are verified by the hybrid search scheme for feasibility and quality.
  • ...and 19 more figures

Theorems & Definitions (17)

  • Remark 1
  • Remark 2
  • Definition 1: Path Segments
  • Definition 2: Partial Ordering
  • Definition 3: Task plan
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Remark 3
  • Remark 4
  • ...and 7 more