
Learning to Navigate Socially Through Proactive Risk Perception

Erjia Xiao, Lingfeng Zhang, Yingbo Tang, Hao Cheng, Renjing Xu, Wenbo Ding, Lei Zhou, Long Chen, Hangjun Ye, Xiaoshuai Hao

TL;DR

The paper tackles socially compliant indoor navigation under egocentric RGB-D perception with no global maps. It extends the Falcon framework by adding a Proactive Risk Perception Module that predicts distance-based collision risk scores for nearby humans and trains with a dedicated risk loss, in addition to Falcon's existing auxiliary tasks. Key contributions include explicit risk quantification with continuous supervision, integration into a multi-task learning objective, and empirical validation on the Social-HM3D benchmark showing competitive performance and strong social compliance. The findings demonstrate that risk-aware auxiliary learning improves proactive collision avoidance and personal-space maintenance, with practical implications for robust real-world indoor navigation.

Abstract

In this report, we describe the technical details of our submission to the IROS 2025 RoboSense Challenge Social Navigation Track. This track focuses on developing RGB-D-based perception and navigation systems that enable autonomous agents to navigate safely, efficiently, and in a socially compliant manner in dynamic, human-populated indoor environments. The challenge requires agents to operate from an egocentric perspective using only onboard sensors, including RGB-D observations and odometry, without access to global maps or privileged information, while complying with social norms such as keeping safe distances and avoiding collisions. Building upon the Falcon model, we introduce a Proactive Risk Perception Module to enhance social navigation performance. The module augments Falcon with collision risk understanding: it learns to predict distance-based collision risk scores for surrounding humans, which enables the agent to develop more robust spatial awareness and proactive collision avoidance behaviors. The evaluation on the Social-HM3D benchmark demonstrates that our method improves the agent's ability to maintain personal space compliance while navigating toward goals in crowded indoor scenes with dynamic human agents, achieving 2nd place among 16 participating teams in the challenge.
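
To make the idea of a distance-based collision risk score concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the risk formula or the loss, so the linear falloff, the `safe_radius_m` parameter, the mean-squared-error loss, and the function names `risk_score` and `risk_loss` are all assumptions chosen for illustration.

```python
def risk_score(distance_m, safe_radius_m=2.0):
    """Map a robot-human distance to a continuous risk target in [0, 1].

    ASSUMPTION: linear falloff from 1.0 at contact to 0.0 at safe_radius_m.
    The source only says the score is distance-based and continuous.
    """
    if safe_radius_m <= 0:
        raise ValueError("safe_radius_m must be positive")
    return min(1.0, max(0.0, 1.0 - distance_m / safe_radius_m))


def risk_loss(predicted_scores, human_distances_m, safe_radius_m=2.0):
    """Auxiliary loss between predicted and distance-derived risk scores.

    ASSUMPTION: mean squared error over the nearby humans; the actual
    loss used in the paper is not stated in this section.
    """
    targets = [risk_score(d, safe_radius_m) for d in human_distances_m]
    if not targets:
        return 0.0  # no nearby humans, so nothing to supervise
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted_scores, targets)]
    return sum(squared_errors) / len(targets)
```

Under these assumptions, a human at 1 m with a 2 m safe radius yields a target of 0.5, and the loss would be added to Falcon's existing auxiliary objectives with some weighting that this section does not specify.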

Paper Structure

This paper contains 20 sections, 12 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: SocialNav task illustration, adapted from gong2025cognition. In (a), the robot navigates toward a goal (blue dashed lines) while predicting human trajectories (red dashed line) and avoiding them. The robot uses depth input as shown in (b). (c) offers a top-down map for reference, which is not used by the robot.
  • Figure 2: Method overview, adapted from gong2025cognition. Building upon the Falcon framework, our approach integrates a Proactive Risk Perception Module that operates alongside the main policy network. The policy network processes depth and GPS+Compass observations, guided by social cognition penalties. During training, the state encoder outputs are fed to both Falcon's original Spatial-Temporal Precognition Module and our proactive risk perception module. The risk module predicts distance-based collision risks for each nearby human, generating an auxiliary loss that enhances the agent's spatial awareness and collision avoidance capabilities.
  • Figure 3: Human Distribution by Scene Area in Social-HM3D (Train/Test Set): The benchmark calibrates human density based on diverse scene areas. Scenes are categorized: small spaces (0-40 m$^2$) with 0-2 humans, medium spaces (40-80 m$^2$) with 4 humans, and large spaces ($> 80$ m$^2$) with 6 humans. This area-proportional scaling ensures realistic social interaction density while preventing overcrowding in human-shared environments.
  • Figure 4: Episode demonstration. Compared to the baseline, our method successfully predicts human trajectories, proactively moves to a non-obstructive position, and thereby avoids collisions. Green indicates safe behaviors, and red indicates collisions with humans.
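
The area-based human density described in the Figure 3 caption can be sketched as a simple lookup. This is an illustrative sketch only: the function name `humans_for_scene_area` is hypothetical, and the handling of the exact boundaries (whether 40 m² and 80 m² fall in the smaller or larger category) is an assumption, since the caption does not say.

```python
def humans_for_scene_area(area_m2):
    """Return (min_humans, max_humans) for a scene of the given floor area.

    Categories follow the Figure 3 caption:
      small  (0-40 m^2):  0-2 humans
      medium (40-80 m^2): 4 humans
      large  (> 80 m^2):  6 humans
    ASSUMPTION: boundary values 40 and 80 are assigned to the smaller category.
    """
    if area_m2 < 0:
        raise ValueError("area_m2 must be non-negative")
    if area_m2 <= 40:
        return (0, 2)
    if area_m2 <= 80:
        return (4, 4)
    return (6, 6)
```

For example, `humans_for_scene_area(60)` falls in the medium category and returns `(4, 4)`.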