SAP-CoPE: Social-Aware Planning using Cooperative Pose Estimation with Infrastructure Sensor Nodes
Minghao Ning, Yufeng Yang, Shucheng Huang, Jiaming Zhong, Keqi Shu, Chen Sun, Ehsan Hashemi, Amir Khajepour
TL;DR
The paper addresses limited perception and occlusion in indoor autonomous systems by proposing SAP-CoPE, a framework that combines Infrastructure Sensor Nodes for cooperative perception with a probabilistic 3D human pose estimation method and an MPC-based controller. A novel pose estimation pipeline fuses image data with sparse LiDAR measurements, propagates uncertainty through the camera-to-world projection via a Jacobian, and enforces joint coherence with bone-length constraints. The MPC planner integrates an Artificial Potential Field and a Personal Space field to yield socially comfortable trajectories. The framework supports single- or multi-camera configurations and includes a delay-aware global perception layer that compensates for latency across sensors; simulation and real-world experiments demonstrate real-time capability and robustness. Overall, SAP-CoPE delivers socially aware navigation that respects personal space, improves safety, and enhances human-robot interaction in dynamic indoor environments, with potential impact on healthcare, logistics, and service robotics.
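To make the Jacobian-based uncertainty propagation concrete, the following is a minimal sketch of first-order covariance propagation for a single joint, not the paper's actual pipeline. It assumes an ideal pinhole camera, a per-joint depth value (for example from sparse LiDAR), and a known camera-to-world rotation `R` and translation `t`. All function names, parameters, and numeric values are hypothetical and chosen for illustration only.

```python
import numpy as np


def backproject(u, v, d, fx, fy, cx, cy, R, t):
    """Map pixel (u, v) with depth d to a 3D world point (pinhole model, assumed)."""
    p_cam = np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])
    return R @ p_cam + t


def backproject_jacobian(u, v, d, fx, fy, cx, cy, R):
    """Jacobian of the world point with respect to the measurement (u, v, d)."""
    J_cam = np.array([
        [d / fx, 0.0, (u - cx) / fx],
        [0.0, d / fy, (v - cy) / fy],
        [0.0, 0.0, 1.0],
    ])
    # The translation t is constant, so only the rotation enters the Jacobian.
    return R @ J_cam


def propagate_covariance(J, cov_uvd):
    """First-order propagation: Sigma_world = J Sigma_meas J^T."""
    return J @ cov_uvd @ J.T


if __name__ == "__main__":
    # Illustrative values only: 2 px pixel noise, 0.1 m depth noise.
    R, t = np.eye(3), np.zeros(3)
    J = backproject_jacobian(400.0, 300.0, 3.0, 500.0, 500.0, 320.0, 240.0, R)
    cov_world = propagate_covariance(J, np.diag([2.0**2, 2.0**2, 0.1**2]))
    print(cov_world)
```

The resulting 3x3 covariance could then weight each joint in a downstream optimization, so that joints observed far from the camera or with noisy depth contribute less.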
Abstract
Autonomous driving systems must operate safely in human-populated indoor environments, where relying solely on onboard sensors leads to limited perception and sensitivity to occlusion. These limitations make it difficult to recognize human intentions accurately and to generate comfortable, socially aware trajectories. To address these issues, we propose SAP-CoPE, a social-aware planning framework that integrates cooperative infrastructure with a novel 3D human pose estimation method and a model predictive control (MPC)-based controller. This real-time framework formulates an optimization problem that accounts for uncertainty propagation in the camera projection matrix while ensuring human joint coherence. The proposed method is adaptable to single- or multi-camera configurations and can incorporate sparse LiDAR point-cloud data. To enhance safety and comfort in human environments, we integrate a human personal space field based on human pose into a model predictive controller, enabling the system to navigate while avoiding discomfort zones. Extensive evaluations in both simulated and real-world settings demonstrate the effectiveness of our approach in generating socially aware trajectories for autonomous systems.
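As an illustration of how a pose-based personal space field might enter a planner's cost, the sketch below uses an asymmetric 2D Gaussian oriented along the human's heading. This is a common modeling choice and an assumption here; the abstract does not specify the field's functional form. The function name and all sigma values are hypothetical.

```python
import numpy as np


def personal_space_cost(robot_xy, human_xy, heading,
                        sigma_front=1.2, sigma_back=0.6, sigma_side=0.8):
    """Discomfort in (0, 1] at the robot position, for a human facing `heading` (rad).

    Assumed model: an asymmetric Gaussian that extends farther in front of
    the human than behind. The sigma values are illustrative, not from the paper.
    """
    dx, dy = np.asarray(robot_xy, dtype=float) - np.asarray(human_xy, dtype=float)
    c, s = np.cos(heading), np.sin(heading)
    # Express the robot offset in the human's body frame (x forward, y left).
    x_h = c * dx + s * dy
    y_h = -s * dx + c * dy
    sigma_x = sigma_front if x_h >= 0.0 else sigma_back
    return float(np.exp(-0.5 * ((x_h / sigma_x) ** 2 + (y_h / sigma_side) ** 2)))


if __name__ == "__main__":
    human, heading = (0.0, 0.0), 0.0
    print(personal_space_cost((1.0, 0.0), human, heading))   # 1 m in front
    print(personal_space_cost((-1.0, 0.0), human, heading))  # 1 m behind
```

In an MPC setting, a cost of this kind would typically be summed over the prediction horizon and over all detected humans, alongside tracking and obstacle terms; the heading would come from the estimated 3D pose.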
