SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning
Jianpeng Yao, Xiaopan Zhang, Yu Xia, Zejin Wang, Amit K. Roy-Chowdhury, Jiachen Li
TL;DR
SoNIC addresses safety-critical social navigation by fusing Adaptive Conformal Inference (ACI) with Constrained Reinforcement Learning (CRL). By augmenting observations with online uncertainty and guiding policy learning through a Lagrangian objective that penalizes pedestrian-buffer intrusions, it achieves state-of-the-art safety and social-norm adherence on CrowdNav, including strong robustness to distribution shifts. The approach includes a flexible prediction module (CV or GST), an attention-based policy network, and a spatial-relaxation CRL mechanism that converts collision-rate constraints into actionable, dense costs, improving convergence and safety. Real-robot experiments in ROS2 demonstrate robust, socially polite behavior in dense crowds, indicating practical viability for real-world deployment.
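The Lagrangian idea mentioned above can be illustrated with a minimal, generic sketch. It is not the authors' implementation: the function names, the binary cost definition, and the parameters `buffer_radius`, `cost_limit`, and `lr` are all assumptions chosen for illustration. The sketch shows a dense per-step cost that fires whenever the robot enters a buffer zone around any pedestrian, and the standard dual-ascent update of a Lagrange multiplier that weights this cost against the reward.

```python
import math

def intrusion_cost(robot_xy, humans_xy, buffer_radius):
    """Hypothetical dense cost: 1.0 if the robot is inside the buffer
    zone of any pedestrian at this step, otherwise 0.0."""
    for hx, hy in humans_xy:
        if math.hypot(robot_xy[0] - hx, robot_xy[1] - hy) < buffer_radius:
            return 1.0
    return 0.0

def update_multiplier(lam, avg_cost, cost_limit, lr):
    """Generic dual ascent: raise the multiplier when the average cost
    exceeds the limit, lower it otherwise, and keep it non-negative."""
    return max(0.0, lam + lr * (avg_cost - cost_limit))

def penalized_return(reward_return, cost_return, lam):
    """Lagrangian objective the policy maximizes: reward minus weighted cost."""
    return reward_return - lam * cost_return

# Usage with made-up numbers (all values are illustrative assumptions).
lam = 1.0
step_cost = intrusion_cost((0.0, 0.0), [(3.0, 4.0), (10.0, 0.0)], buffer_radius=5.5)
lam = update_multiplier(lam, avg_cost=step_cost, cost_limit=0.25, lr=0.5)
objective = penalized_return(reward_return=10.0, cost_return=step_cost, lam=lam)
```

Because the cost is incurred on every buffer intrusion rather than only on a collision, the agent receives a learning signal well before contact occurs, which is the intuition behind describing such costs as dense.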
Abstract
Reinforcement learning (RL) enables social robots to generate trajectories without relying on human-designed rules or interventions, which makes it generally more effective than rule-based systems at adapting to complex, dynamic real-world scenarios. However, social navigation is a safety-critical task that requires robots to avoid collisions with pedestrians, and existing RL-based solutions often fall short of ensuring safety in complex environments. In this paper, we propose SoNIC, which to the best of our knowledge is the first algorithm that integrates adaptive conformal inference (ACI) with constrained reinforcement learning (CRL) to enable safe policy learning for social navigation. Specifically, our method not only augments RL observations with ACI-generated nonconformity scores, which inform the agent of the quantified uncertainty, but also uses these uncertainty estimates to guide the behaviors of RL agents through CRL. This integration regulates the behaviors of RL agents and enables them to handle safety-critical situations. On the standard CrowdNav benchmark, our method achieves a success rate of 96.93%, which is 11.67% higher than the previous state-of-the-art RL method, with 4.5 times fewer collisions and 2.8 times fewer intrusions into ground-truth human future trajectories, as well as enhanced robustness in out-of-distribution scenarios. To further validate our approach, we deploy our algorithm on a real robot by developing a ROS2-based navigation system. Our experiments demonstrate that the system makes robust and socially polite decisions when interacting with both sparse and dense crowds. Video demos can be found on our project website: https://sonic-social-nav.github.io/.
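To make the role of ACI concrete, the following is a minimal sketch of the standard adaptive conformal inference update, in which the miscoverage level is adjusted online according to whether the latest prediction error fell inside the current uncertainty radius. It is a generic illustration rather than the paper's code: the function name `aci_step`, the use of a scalar displacement error as the nonconformity score, and the values of `gamma` and `target_alpha` are assumptions.

```python
import math

def aci_step(alpha_t, target_alpha, gamma, past_scores, new_score):
    """One ACI update (hypothetical helper).

    past_scores: earlier nonconformity scores, e.g. distances between
                 predicted and observed pedestrian positions (assumed).
    new_score:   the score observed at the current step.
    Returns (radius, alpha_next).
    """
    # Empirical (1 - alpha_t) quantile of past scores; the level is
    # clipped to [0, 1] because alpha_t can drift outside that range.
    level = min(max(1.0 - alpha_t, 0.0), 1.0)
    ordered = sorted(past_scores)
    idx = min(len(ordered) - 1, max(0, math.ceil(level * len(ordered)) - 1))
    radius = ordered[idx]

    # err is 1 when the new error escaped the predicted radius.
    err = 1.0 if new_score > radius else 0.0

    # A miss lowers alpha (wider sets next time); a hit raises it.
    alpha_next = alpha_t + gamma * (target_alpha - err)
    return radius, alpha_next

# Usage with made-up scores: the radius could be appended to the
# RL observation as an online uncertainty estimate.
scores = [0.10, 0.20, 0.30, 0.40]
alpha = 0.1
for observed_error in [0.15, 0.55, 0.25]:
    radius, alpha = aci_step(alpha, target_alpha=0.1, gamma=0.05,
                             past_scores=scores, new_score=observed_error)
    scores.append(observed_error)
```

The online adjustment is what lets the uncertainty estimate track distribution shift: when the predictor starts missing more often, the radius grows without any retraining.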
