Effective and Robust Non-Prehensile Manipulation via Persistent Homology Guided Monte-Carlo Tree Search
Ewerton R. Vieira, Kai Gao, Daniel Nakhimovich, Kostas E. Bekris, Jingjin Yu
TL;DR
PHIM addresses object retrieval in cluttered environments by combining persistent homology with Monte-Carlo Tree Search to plan long-horizon non-prehensile pushing. The approach automatically identifies blocking object clusters within the feasible path region and guides pushing actions using a cluster-dispersion–enhanced reward, enabling robust planning under uncertainty. Real-world Baxter robot experiments and extensive simulations show PHIM achieves higher success rates and fewer actions than baselines, demonstrating robustness to pose and actuation noise and the practicality of offline planning that transfers to online execution. This topology-informed planning framework offers a scalable solution for clutter clearance and can be extended to denser scenes and integrated with pick-and-place strategies.
Abstract
Object retrieval in real-world workspaces must contend with \emph{uncertainty} and \emph{clutter}. One option is to apply prehensile operations, which can be time-consuming in highly cluttered scenarios. Non-prehensile actions, such as pushing multiple objects simultaneously, can instead help to quickly clear a cluttered workspace and retrieve a target object. Such actions, however, can also increase uncertainty, as the outcome of a pushing operation is difficult to estimate. The framework proposed in this work integrates topological tools and Monte-Carlo Tree Search (MCTS) to achieve effective and robust pushing for object retrieval. It employs persistent homology to automatically identify manageable clusters of blocking objects without the need to manually adjust hyper-parameters. MCTS then uses this information to explore feasible actions that push groups of objects, aiming to minimize the number of operations needed to clear the path to the target. Real-world experiments using a Baxter robot, whose actuation involves some noise, show that the proposed framework achieves a higher success rate in solving retrieval tasks in dense clutter than alternatives. Moreover, it produces solutions with few pushing actions, improving the overall execution time. More critically, it is robust enough that the sequence of actions can be planned offline and then executed reliably on a Baxter robot.
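To make the clustering idea concrete, the following is a minimal sketch of how 0-dimensional persistent homology can group object positions without a hand-tuned distance threshold. It is not the implementation used in this work: the function names (`h0_death_scales`, `clusters_at_scale`, `most_persistent_clusters`) are hypothetical, objects are reduced to 2D center points, and the rule of cutting at the largest gap between consecutive merge scales is an assumption chosen for illustration.

```python
import math
from itertools import combinations


def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def h0_death_scales(points):
    """Distances at which connected components merge (H0 death times).

    Every component is born at scale 0, so these values are also the
    persistence of the components that die.
    """
    parent = list(range(len(points)))
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for dist, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(dist)
    return deaths  # ascending, because edges were sorted


def clusters_at_scale(points, scale):
    """Connected components when points within `scale` are linked."""
    parent = list(range(len(points)))
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= scale:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())


def most_persistent_clusters(points):
    """Cut at the largest gap between merge scales (assumed rule)."""
    deaths = h0_death_scales(points)
    if len(deaths) < 2:
        # Too few merges to compare gaps: treat everything as one group.
        return [list(range(len(points)))]
    gaps = [(deaths[k + 1] - deaths[k], k) for k in range(len(deaths) - 1)]
    _, k = max(gaps)
    return clusters_at_scale(points, deaths[k])


# Two visually separate groups of object centers (made-up coordinates).
centers = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
clusters = most_persistent_clusters(centers)
```

In this sketch the cut scale comes from the data itself, which is the sense in which no threshold needs manual tuning. A planner could then treat each returned index group as one candidate set of objects to push.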
