PushingBots: Collaborative Pushing via Neural Accelerated Combinatorial Hybrid Optimization
Zili Tang, Ying Zhang, Meng Guo
TL;DR
This work tackles collaborative pushing of multiple arbitrary objects by a fleet of robots in cluttered environments. It formulates the problem as a combinatorial hybrid optimization (CHO) and introduces a three-layer framework: MAPF-based task decomposition with dynamic subteam assignment, a keyframe-guided hybrid search (KGHS) for efficient mode sequencing, and online diffusion-accelerated execution with adaptation. A diffusion-based predictor, trained offline, proposes promising keyframes and pushing modes, while every proposal is still verified for feasibility, which preserves completeness guarantees under mild assumptions. Extensive simulations and hardware experiments demonstrate scalability to many objects and heterogeneous robots, generalization to 6D pushing, and robustness to perturbations, outperforming multiple baselines in both single- and multi-object scenarios. The approach offers a practical, theoretically grounded solution for large-scale cooperative pushing, with potential extensions to distributed control and dynamic obstacles.
Abstract
Many robots are not equipped with a manipulator, and many objects, such as large boxes and cylinders, are not suitable for prehensile manipulation. In these cases, pushing is a simple yet effective non-prehensile skill that lets robots interact with and change their environment. Existing work often assumes a set of predefined pushing modes and fixed-shape objects. This work tackles the general problem of controlling a robotic fleet to collaboratively push numerous arbitrary objects to their respective destinations, within complex environments containing cluttered and movable obstacles. The problem combines challenges characteristic of multi-robot systems, such as online task coordination under large uncertainties in cost and duration, with those of contact-rich tasks, such as hybrid switching among different contact modes and under-actuation due to constrained contact forces. The proposed method is based on combinatorial hybrid optimization over dynamic task assignments and hybrid execution via sequences of pushing modes and associated forces. It consists of three main components: (I) the decomposition, ordering and rolling assignment of pushing subtasks to robot subgroups; (II) the keyframe-guided hybrid search to optimize the sequence of parameterized pushing modes for each subtask; (III) the hybrid control to execute these modes and transition among them. In addition, a diffusion-based accelerator is adopted to predict the keyframes and pushing modes that should be prioritized during hybrid search, which further improves planning efficiency. The framework is complete under mild assumptions. Its efficiency and effectiveness are validated extensively in simulations and hardware experiments with different numbers of robots and general-shaped objects, along with its generalization to heterogeneous robots, planar assembly and 6D pushing.
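The idea that a learned predictor can prioritize candidates during search without weakening completeness can be illustrated with a toy example. The sketch below is not the paper's keyframe-guided hybrid search; it is a generic best-first search on a small grid, written under the assumption that the predictor only reorders the frontier while a separate feasibility check gates every step. All names (`prioritized_search`, `neighbors`, `is_feasible`, `make_score`, the grid and its obstacles) are hypothetical, and a Manhattan-distance heuristic stands in for the diffusion-based predictor.

```python
import heapq
from itertools import count

GRID = 5  # hypothetical 5x5 workspace
OBSTACLES = {(1, 0), (1, 1), (1, 2), (3, 2), (3, 3), (3, 4)}  # hypothetical clutter


def neighbors(p):
    """Candidate next object positions: one unit push in each axis direction."""
    x, y = p
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def is_feasible(_, q):
    """Verification step: the target cell must be in bounds and obstacle-free."""
    x, y = q
    return 0 <= x < GRID and 0 <= y < GRID and q not in OBSTACLES


def make_score(goal):
    """Stand-in for a learned predictor: lower score means expand earlier."""
    return lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])


def prioritized_search(start, goal, neighbors, is_feasible, score):
    """Best-first search where `score` only orders the frontier.

    Every feasible successor is still enqueued, so on a finite state space
    a path is found whenever one exists, however poor the scores are.
    """
    tie = count()  # tie-breaker so the heap never compares paths
    frontier = [(score(start), next(tie), start, [start])]
    visited = {start}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt in neighbors(state):
            if nxt in visited or not is_feasible(state, nxt):
                continue
            visited.add(nxt)
            heapq.heappush(frontier, (score(nxt), next(tie), nxt, path + [nxt]))
    return None  # frontier exhausted: no feasible path exists


if __name__ == "__main__":
    goal = (4, 4)
    print(prioritized_search((0, 0), goal, neighbors, is_feasible, make_score(goal)))
```

Because the score affects only the order of expansion, replacing it with a misleading one (for example its negation) changes how many states are explored but not whether a path is found. This mirrors, in a much simpler setting, the role the abstract assigns to the accelerator.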
