SACSoN: Scalable Autonomous Control for Social Navigation

Noriaki Hirose, Dhruv Shah, Ajay Sridhar, Sergey Levine

TL;DR

SACSoN tackles socially unobtrusive robot navigation by learning from interaction-rich, autonomously collected data and by explicitly minimizing counterfactual perturbations to human behavior. The approach combines a vision-based policy with predictive models of pedestrian motion, enforced through objectives $J_\text{cp}$ and $J_\text{ps}$, and is trained on the large HuRoN dataset collected via the autonomous HuRoN system. Key contributions include the SACSoN policy, the HuRoN data-collection platform with interaction-enhancing objectives, and the HuRoN dataset itself, which supports improved pedestrian forecasting and safer navigation. The work demonstrates measurable improvements in safety and social compliance, and shows the potential for continual improvement through daily data collection and simulation-augmented training.

Abstract

Machine learning provides a powerful tool for building socially compliant robotic systems that go beyond simple predictive models of human behavior. By observing and understanding human interactions from past experiences, learning can enable effective social navigation behaviors directly from data. In this paper, our goal is to develop methods for training policies for socially unobtrusive navigation, such that robots can navigate among humans in ways that don't disturb human behavior. We introduce a definition for such behavior based on the counterfactual perturbation of the human: if the robot had not intruded into the space, would the human have acted in the same way? By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space. Instantiating this principle requires training policies to minimize their effect on human behavior, and this in turn requires data that allows us to model the behavior of humans in the presence of robots. Therefore, our approach is based on two key contributions. First, we collect a large dataset where an indoor mobile robot interacts with human bystanders. Second, we utilize this dataset to train policies that minimize counterfactual perturbation. We provide supplementary videos and make publicly available the largest-of-its-kind visual navigation dataset on our project page.
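The counterfactual-perturbation idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the exact forms of $J_\text{cp}$ and $J_\text{ps}$ are not given in this section, so the code below assumes a mean squared displacement between a pedestrian's predicted trajectory (robot present) and their intended trajectory (robot absent), and a squared hinge penalty on robot-pedestrian distance. The function names, the `radius` parameter, and its default value are hypothetical.

```python
import math


def counterfactual_perturbation(predicted, intended):
    """Assumed stand-in for J_cp: mean squared displacement between the
    pedestrian trajectory predicted with the robot present and the
    intended (robot-absent) trajectory. Both are lists of (x, y) points."""
    if not predicted or len(predicted) != len(intended):
        raise ValueError("trajectories must be non-empty and equally long")
    total = 0.0
    for (px, py), (ix, iy) in zip(predicted, intended):
        total += (px - ix) ** 2 + (py - iy) ** 2
    return total / len(predicted)


def personal_space_violation(robot_path, pedestrian_path, radius=1.0):
    """Assumed stand-in for J_ps: mean squared hinge penalty that is
    non-zero only when a future robot position lies within `radius`
    (a hypothetical value, in meters) of the pedestrian."""
    if not robot_path or len(robot_path) != len(pedestrian_path):
        raise ValueError("paths must be non-empty and equally long")
    total = 0.0
    for (rx, ry), (hx, hy) in zip(robot_path, pedestrian_path):
        dist = math.hypot(rx - hx, ry - hy)
        total += max(0.0, radius - dist) ** 2
    return total / len(robot_path)


# Toy example: the pedestrian intended to walk straight ahead, but is
# predicted to drift sideways when the robot is present.
intended = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
predicted = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
j_cp = counterfactual_perturbation(predicted, intended)

robot = [(0.0, 0.0), (0.0, 0.0)]
pedestrian = [(0.5, 0.0), (2.0, 0.0)]
j_ps = personal_space_violation(robot, pedestrian)
```

Under these assumptions, a robot action that leaves the pedestrian's predicted path equal to the intended path incurs zero perturbation cost, which matches the abstract's question of whether the human would have acted the same way had the robot not intruded.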

Paper Structure (16 sections, 7 equations, 15 figures, 4 tables)


Figures (15)

  • Figure 1: SACSoN is a socially unobtrusive vision-based navigation policy for human-occupied spaces. We penalize counterfactual perturbations (gray) from the intended human trajectory (navy) and generate compliant commands (orange).
  • Figure 2: Our proposed objectives $J_\text{cp}$ and $J_\text{ps}$ for training the SACSoN policy. $J_\text{cp}$ penalizes the counterfactual perturbation from the pedestrian's estimated intended trajectory (left). $J_\text{ps}$ penalizes personal space violations in the future space (right).
  • Figure 3: Pedestrian detection and tracking. We use a combination of YOLO and DeepSORT to detect and track pedestrians from visual observations, and estimate their relative position using the scaled depth estimates from ExAug's perception module.
  • Figure 4: HuRoN System overview. We design our autonomous data collection platform around a vision-based navigation system (gray) that uses a topological graph and a learned control policy. Our proposed system has three key components: a help-and-rescue module for collision recovery (orange), long-term anchors for localization (blue), and continual learning (yellow).
  • Figure 5: Data collection platform. We collect spherical and fisheye RGB images, 2D LiDAR, global odometry (using long-term visual anchors), and bumper signals using our robotic system.
  • ...and 10 more figures