Look as You Leap: Planning Simultaneous Motion and Perception for High-DOF Robots

Qingxi Meng, Emiliano Flores, Carlos Quintero-Peña, Peizhu Qian, Zachary Kingston, Shannan K. Hamlin, Vaibhav Unhelkar, Lydia E. Kavraki

TL;DR

PS-PRM introduces a perception-score-guided, GPU-accelerated PRM for high-DOF robots that jointly optimizes motion and perception by using a neural surrogate to estimate perception quality from SE(3) camera poses. The method integrates visibility constraints, occlusion-aware ray casting, and batch GPU processing to enable real-time replanning in dynamic environments, with probabilistic completeness preserved. Empirical results across simulation and real robotic platforms show consistent improvements in perception performance and planning efficiency, with neural surrogates delivering up to ~10× preprocessing speedups while maintaining comparable detection accuracy. The work demonstrates practical benefits for perception-critical tasks in human-centered settings, such as nursing and home environments, and outlines future directions in domain adaptation, uncertainty modeling, and broader perception tasks.

Abstract

Most common tasks for robots in dynamic spaces require that the environment is regularly and actively perceived, with many of them explicitly requiring objects or persons to be within view, e.g., for monitoring or safety. However, solving motion and perception tasks simultaneously is challenging, as these objectives often impose conflicting requirements. Furthermore, while robots must react quickly to changes in the environment, directly evaluating the quality of perception (e.g., object detection confidence) is often expensive or infeasible at runtime. This problem is especially important in human-centered environments, such as homes and hospitals, where effective perception is essential for safe and reliable operation. In this work, we address the challenge of solving motion planning problems for high-degree-of-freedom (DoF) robots from a start to a goal configuration with continuous perception constraints in both static and dynamic environments. We propose a GPU-parallelized perception-score-guided probabilistic roadmap planner with a neural surrogate model (PS-PRM). Unlike existing active-perception, visibility-aware, or learning-based planners, our work integrates perception tasks and constraints directly into the motion planning formulation. Our method uses a neural surrogate model to approximate perception scores, incorporates them into the roadmap, and leverages GPU parallelism to enable efficient online replanning in dynamic settings. We demonstrate that our planner, evaluated on high-DoF robots, outperforms baseline methods in both static and dynamic environments, in simulation and in real-robot experiments.
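To make the idea of incorporating perception scores into a roadmap concrete, the following is a minimal sketch, not the paper's implementation. It runs Dijkstra's algorithm over a tiny 2-D roadmap whose edge cost adds a perception penalty to the edge length. The function name `plan`, the cost form `length + weight * (1 - mean endpoint score)`, the `weight` parameter, and the hand-written scores are all assumptions made for illustration; the abstract does not specify how PS-PRM combines the two terms.

```python
import heapq
import math


def plan(nodes, edges, scores, start, goal, weight=1.0):
    """Dijkstra over a roadmap with a perception-penalized edge cost.

    nodes:  {name: (x, y)} roadmap vertices (2-D stand-ins for configurations)
    edges:  iterable of (u, v) undirected roadmap edges
    scores: {name: perception score in [0, 1]} (hypothetical values)
    weight: trade-off between length and perception (assumed, not from the paper)
    """
    # Build an adjacency list; each edge costs its length plus a penalty
    # that grows as the mean perception score of its endpoints drops.
    adj = {n: [] for n in nodes}
    for u, v in edges:
        length = math.dist(nodes[u], nodes[v])
        penalty = 1.0 - 0.5 * (scores[u] + scores[v])
        cost = length + weight * penalty
        adj[u].append((v, cost))
        adj[v].append((u, cost))

    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            break
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v, c in adj[u]:
            nd = d + c
            if nd < best.get(v, math.inf):
                best[v] = nd
                parent[v] = u
                heapq.heappush(queue, (nd, v))

    if goal not in best:
        return None  # goal is not connected to start
    # Walk parent links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


# Two routes from "s" to "g": via "a" (short, poor view) or "b" (longer, good view).
nodes = {"s": (0.0, 0.0), "a": (1.0, 0.0), "b": (1.0, 1.0), "g": (2.0, 0.0)}
edges = [("s", "a"), ("a", "g"), ("s", "b"), ("b", "g")]
scores = {"s": 1.0, "a": 0.0, "b": 1.0, "g": 1.0}

shortest = plan(nodes, edges, scores, "s", "g", weight=0.0)
perceptive = plan(nodes, edges, scores, "s", "g", weight=5.0)
```

With `weight=0.0` the search ignores perception and takes the shorter route through `a`; with a larger weight it accepts the detour through `b` to keep the target in good view. This illustrates the conflict between motion and perception objectives that the abstract describes.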

Paper Structure

This paper contains 26 sections, 6 equations, 13 figures, 5 tables, 2 algorithms.

Figures (13)

  • Figure 1: The robot (a UR5 mounted on a differential drive base) moves through a cluttered environment while maintaining high perception performance. In this example, the robot must continuously track a monitor located down the perpendicular hallway using its wrist-mounted camera. Top: Illustration of the robot’s trajectory (fading from right to left) as it moves while maintaining a high detection rate of the monitor. Bottom: Camera views corresponding to the robot configurations shown in the top figure.
  • Figure 2: Illustration of the PS-PRM pipeline for perception-aware motion planning. Bottom: the environment with multiple robot configurations. Middle blocks: PRM generates candidate paths, a neural surrogate model and occlusion checking run in parallel to evaluate perception scores, and PS-PRM selects the optimal path based on predicted perception quality. Top: the camera view confirms differing perception scores across configurations.
  • Figure 3: Illustration of the camera model, depicting both the line of sight (orange) and the camera frustum (purple). A cup is included to show when an object intersects the line of sight or falls within the camera frustum.
  • Figure 4: Pipeline for perception-aware evaluation using a neural surrogate model and occlusion-aware ray tracing. From left to right: the environment contains the target (human) and an occluding object (orange box); a surrogate model predicts a dense perception score field; ray tracing incorporates occlusion to refine the score; and camera views at viewpoints A and B illustrate visibility differences. High perception scores correspond to views with minimal occlusion.
  • Figure 5: Pipeline for evaluating perception scores for multiple robot configurations in parallel. For each batch of configurations, camera poses are computed using FK, while occlusion checking via ray casting and perception score estimation via a neural surrogate model are performed concurrently. The outputs are combined to yield final perception scores for each configuration.
  • ...and 8 more figures
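
The evaluation flow described in the Figure 5 caption (camera poses from FK, an occlusion check by ray casting, a surrogate score, then a combined result) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the robot is a planar 2-link arm, occluders are circles, the "surrogate" is a hand-written function of range rather than a trained network, and occluded views are simply assigned a score of zero. All function names and parameters are hypothetical. The paper evaluates batches in parallel on a GPU; this sketch loops sequentially for clarity.

```python
import math


def camera_position(q, link_lengths=(1.0, 1.0)):
    """FK of a planar 2-link arm (toy stand-in for the real robot's kinematics)."""
    l1, l2 = link_lengths
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return (x, y)


def line_of_sight_blocked(cam, target, obstacles):
    """True if the camera-to-target segment touches any circular obstacle."""
    dx, dy = target[0] - cam[0], target[1] - cam[1]
    seg_len_sq = dx * dx + dy * dy
    for ox, oy, radius in obstacles:
        if seg_len_sq == 0.0:
            t = 0.0
        else:
            # Project the obstacle center onto the segment and clamp to [0, 1].
            t = ((ox - cam[0]) * dx + (oy - cam[1]) * dy) / seg_len_sq
            t = max(0.0, min(1.0, t))
        closest = (cam[0] + t * dx, cam[1] + t * dy)
        if math.dist(closest, (ox, oy)) <= radius:
            return True
    return False


def surrogate_score(cam, target, best_range=2.0):
    """Placeholder for a learned model: score peaks at an assumed ideal range."""
    return math.exp(-((math.dist(cam, target) - best_range) ** 2))


def perception_scores(configs, target, obstacles):
    """Score each configuration; occluded views get 0.0 (an assumed combination rule)."""
    scores = []
    for q in configs:
        cam = camera_position(q)
        visible = not line_of_sight_blocked(cam, target, obstacles)
        scores.append(surrogate_score(cam, target) if visible else 0.0)
    return scores


# Two arm configurations looking at one target past a single circular occluder.
configs = [(0.0, 0.0), (math.pi / 2, 0.0)]
target = (4.0, 0.0)
obstacles = [(2.0, 1.0, 0.5)]  # (center_x, center_y, radius)
scores = perception_scores(configs, target, obstacles)
```

Here the first configuration places the camera at the assumed ideal range with a clear line of sight, so it receives the maximum score, while the second configuration's line of sight passes through the occluder and is scored zero.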