IMPACT: Intelligent Motion Planning with Acceptable Contact Trajectories via Vision-Language Models
Yiyang Ling, Karan Owalekar, Oluwatobiloba Adesanya, Erdem Bıyık, Daniel Seita
TL;DR
IMPACT addresses robot manipulation in densely cluttered environments by permitting semantically acceptable contact. It uses Vision-Language Models to infer per-object contact tolerances from scene images, then uses the resulting costs to build an anisotropic, directional safety map. This map guides a planner built on three motion primitives (Move, Rotate, Push). A contact-aware A* search then produces trajectories that minimize risk while still allowing contact with environmental objects when needed. The approach is evaluated in simulation and real-world experiments, including human judgments. The results show improved success rates, reduced contact duration, and trajectories that align better with human preferences, which points to the practical potential of semantically informed, contact-tolerant planning in dense clutter.
Abstract
Motion planning involves determining a sequence of robot configurations to reach a desired pose, subject to movement and safety constraints. Traditional motion planning finds collision-free paths, but this is overly restrictive in clutter, where it may not be possible for a robot to accomplish a task without contact. In addition, contacts range from relatively benign (e.g. brushing a soft pillow) to more dangerous (e.g. toppling a glass vase), making it difficult to characterize which may be acceptable. In this paper, we propose IMPACT, a novel motion planning framework that uses Vision-Language Models (VLMs) to infer environment semantics, identifying which parts of the environment can best tolerate contact based on object properties and locations. Our approach generates an anisotropic cost map that encodes directional push safety. We pair this map with a contact-aware A* planner to find stable contact-rich paths. We perform experiments using 20 simulation and 10 real-world scenes and evaluate performance using task success rate, object displacements, and feedback from human evaluators. Our results over 3200 simulation and 200 real-world trials suggest that IMPACT enables efficient contact-rich motion planning in cluttered settings while outperforming alternative methods and ablations. Our project website is available at https://impact-planning.github.io/.
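
To make the pairing of an anisotropic cost map with a contact-aware A* search concrete, the following is a minimal sketch on a 2D grid. It is not the authors' implementation: the grid world, the names `build_cost_map` and `plan`, the per-object `contact_cost` values (hand-set here as a stand-in for VLM output), and the per-direction `dir_factor` multipliers are all assumptions made for illustration.

```python
import heapq
import math

# Grid moves as (d_row, d_col): up, down, left, right.
DIRS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def build_cost_map(labels, contact_cost, dir_factor):
    """Return cost_map[r][c][d]: penalty for entering cell (r, c) via move d.

    contact_cost[label] is a hypothetical per-object penalty (in IMPACT this
    kind of signal would come from a VLM; here it is hand-set).
    dir_factor[label][d] scales the penalty by push direction, which is what
    makes the map anisotropic.
    """
    rows, cols = len(labels), len(labels[0])
    cost_map = [[[0.0] * 4 for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            label = labels[r][c]
            if label is None:
                continue  # free space: no contact penalty
            for d in range(4):
                cost_map[r][c][d] = contact_cost[label] * dir_factor[label][d]
    return cost_map


def plan(cost_map, start, goal):
    """A* over grid cells; step cost = 1 + directional contact penalty."""
    rows, cols = len(cost_map), len(cost_map[0])

    def h(cell):
        # Manhattan distance is admissible because every step costs >= 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0.0}
    parent = {}
    frontier = [(h(start), 0.0, start)]
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if g > best[cell]:
            continue  # stale queue entry
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1], g
        for d, (dr, dc) in enumerate(DIRS):
            nr, nc = cell[0] + dr, cell[1] + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            step = 1.0 + cost_map[nr][nc][d]
            if math.isinf(step):
                continue  # contact with this object is never acceptable
            ng = g + step
            if ng < best.get((nr, nc), math.inf):
                best[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None, math.inf


# Toy scene: a column of objects separates start from goal.
labels = [
    [None, None, "vase", None, None],
    [None, None, "pillow", None, None],
    [None, None, "vase", None, None],
]
contact_cost = {"pillow": 1.0, "vase": math.inf}  # hypothetical values
dir_factor = {
    "pillow": [3.0, 3.0, 3.0, 1.0],  # pushing rightward assumed safest
    "vase": [1.0, 1.0, 1.0, 1.0],
}
cost_map = build_cost_map(labels, contact_cost, dir_factor)
path, cost = plan(cost_map, (1, 0), (1, 4))
print(path, cost)
```

In this toy scene a strictly collision-free planner has no solution, because every cell in the middle column is occupied. The contact-aware search instead routes through the pillow, the only object whose contact penalty is finite, and pays the lower penalty associated with pushing it in its safest direction.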
