ConPoSe: LLM-Guided Contact Point Selection for Scalable Cooperative Object Pushing
Noah Steinkrüger, Nisarga Nilavadi, Wolfram Burgard, Tanja Katharina Kaiser
TL;DR
This paper tackles scalable cooperative object pushing by multiple nonholonomic robots that have only limited knowledge of the object. It introduces ConPoSe, an LLM-guided local search framework that selects contact points by prompting an LLM with the target pushing direction and then refining the proposed selection through neighborhood search. In simulation, the approach scales well in computation time and achieves high success rates across various object shapes and robot counts, outperforming a purely LLM-based method and matching or exceeding analytical baselines in many settings. A key finding is that switching between contact points is the main bottleneck, which points future work toward more robust switching strategies and real-world validation.
Abstract
Object transportation in cluttered environments is a fundamental task in various domains, including domestic service and warehouse logistics. In cooperative object transport, multiple robots must coordinate to move objects that are too large for a single robot. One transport strategy is pushing, which only requires simple robots. However, careful selection of robot-object contact points is necessary to push the object along a preplanned path. Although this selection can be solved analytically, the solution space grows combinatorially with the number of robots and object size, limiting scalability. Inspired by how humans rely on common-sense reasoning for cooperative transport, we propose combining the reasoning capabilities of Large Language Models with local search to select suitable contact points. Our LLM-guided local search method for contact point selection, ConPoSe, successfully selects contact points for a variety of shapes, including cuboids, cylinders, and T-shapes. We demonstrate that ConPoSe scales better with the number of robots and object size than the analytical approach, and also outperforms pure LLM-based selection.
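The idea of seeding a local search with a proposed contact-point selection can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the square object, the discretization in `square_candidates`, the force-alignment and torque cost in `cost`, the hill-climbing loop in `local_search`, and the hard-coded `llm_guess`, which merely stands in for a selection that an LLM might propose.

```python
import math


def square_candidates(half=1.0, per_side=5):
    """Hypothetical candidate contact points on a square, listed
    counter-clockwise as (x, y, nx, ny), where (nx, ny) is the inward
    normal, i.e. the direction in which a robot at that point pushes."""
    offsets = [half * (2 * (i + 1) / (per_side + 1) - 1) for i in range(per_side)]
    pts = []
    for t in offsets:            # bottom side, pushes in +y
        pts.append((t, -half, 0.0, 1.0))
    for t in offsets:            # right side, pushes in -x
        pts.append((half, t, -1.0, 0.0))
    for t in reversed(offsets):  # top side, pushes in -y
        pts.append((t, half, 0.0, -1.0))
    for t in reversed(offsets):  # left side, pushes in +x
        pts.append((-half, t, 1.0, 0.0))
    return pts


def cost(selection, candidates, direction, torque_weight=1.0):
    """Assumed cost: misalignment of the net push with the target
    direction plus a penalty on the net torque about the centroid."""
    fx = fy = torque = 0.0
    for idx in selection:
        x, y, nx, ny = candidates[idx]
        fx += nx
        fy += ny
        torque += x * ny - y * nx  # 2D cross product r x f
    norm = math.hypot(fx, fy)
    dx, dy = direction
    alignment = (fx * dx + fy * dy) / norm if norm > 1e-9 else -1.0
    return (1.0 - alignment) + torque_weight * abs(torque)


def local_search(initial, candidates, direction, max_iters=100):
    """Hill climbing: move one robot to an adjacent candidate point
    whenever that strictly lowers the cost."""
    current = list(initial)
    best = cost(current, candidates, direction)
    n = len(candidates)
    for _ in range(max_iters):
        improved = False
        for robot in range(len(current)):
            for step in (-1, 1):
                neighbor = list(current)
                neighbor[robot] = (current[robot] + step) % n
                if len(set(neighbor)) < len(neighbor):
                    continue  # two robots cannot share a contact point
                c = cost(neighbor, candidates, direction)
                if c < best - 1e-12:
                    current, best, improved = neighbor, c, True
        if not improved:
            break
    return current, best


candidates = square_candidates()
llm_guess = [15, 16]  # hypothetical stand-in for an LLM-proposed selection
selection, final_cost = local_search(llm_guess, candidates, (1.0, 0.0))
print(selection, final_cost)
```

Because each step only evaluates the immediate neighbors of the current selection, the work per iteration grows with the number of robots rather than with the number of possible selections. The price is that plain hill climbing can stop in a local optimum, so the quality of the initial guess matters.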
