SACSoN: Scalable Autonomous Control for Social Navigation

Noriaki Hirose; Dhruv Shah; Ajay Sridhar; Sergey Levine

TL;DR

SACSoN tackles socially unobtrusive robot navigation by learning from interaction-rich, autonomously collected data and by explicitly minimizing counterfactual perturbations to human behavior. The approach trains a vision-based policy together with predictive models of pedestrian motion, using the objectives $J_\text{cp}$ (counterfactual perturbation) and $J_\text{ps}$ (personal space), on the large HuRoN dataset collected by the autonomous HuRoN system. Key contributions include the SACSoN policy, the HuRoN data-collection platform with interaction-enhancing objectives, and the HuRoN dataset itself, which supports improved pedestrian forecasting and safer navigation. The work reports improvements in safety and social compliance, and shows the potential for continual improvement through daily data collection and simulation-augmented training.

Abstract

Machine learning provides a powerful tool for building socially compliant robotic systems that go beyond simple predictive models of human behavior. By observing and understanding human interactions from past experiences, learning can enable effective social navigation behaviors directly from data. In this paper, our goal is to develop methods for training policies for socially unobtrusive navigation, such that robots can navigate among humans in ways that don't disturb human behavior. We introduce a definition for such behavior based on the counterfactual perturbation of the human: if the robot had not intruded into the space, would the human have acted in the same way? By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space. Instantiating this principle requires training policies to minimize their effect on human behavior, and this in turn requires data that allows us to model the behavior of humans in the presence of robots. Therefore, our approach is based on two key contributions. First, we collect a large dataset where an indoor mobile robot interacts with human bystanders. Second, we utilize this dataset to train policies that minimize counterfactual perturbation. We provide supplementary videos and make publicly available the largest-of-its-kind visual navigation dataset on our project page.
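To make the counterfactual question above concrete, the following is a minimal sketch of one way such a perturbation could be scored, not the paper's actual objective. It assumes two pedestrian trajectories are already available as arrays of 2D positions: the intended trajectory (what the human would do if the robot were absent) and the trajectory predicted when the robot acts. The function name `counterfactual_perturbation` and the mean-squared-distance form are illustrative assumptions.

```python
import numpy as np


def counterfactual_perturbation(intended_traj, predicted_traj):
    """Mean squared displacement between two pedestrian trajectories.

    Hypothetical helper for illustration only; the squared-distance form
    is an assumption, not the objective defined in the paper.

    intended_traj:  (T, 2) positions the pedestrian would follow if the
                    robot were absent (assumed to come from a predictor).
    predicted_traj: (T, 2) positions predicted when the robot executes
                    its planned actions.
    """
    intended = np.asarray(intended_traj, dtype=float)
    predicted = np.asarray(predicted_traj, dtype=float)
    if intended.shape != predicted.shape:
        raise ValueError("trajectories must have the same shape")
    # Squared Euclidean distance per time step, averaged over the horizon.
    return float(np.mean(np.sum((predicted - intended) ** 2, axis=-1)))


# Pedestrian intends to walk straight along x; the robot's presence is
# predicted to push them 0.5 m sideways at every step.
intended = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
predicted = intended + np.array([0.0, 0.5])
perturbation = counterfactual_perturbation(intended, predicted)
```

A score of zero means the robot is predicted to leave the pedestrian's path unchanged, which is the behavior the abstract describes as unobtrusive.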

Paper Structure (16 sections, 7 equations, 15 figures, 4 tables)

Introduction
Related Work
Preliminaries
Learning a Socially Compliant Policy
Autonomous Data Collection System
System design
Data collection
Evaluation
Socially Compliant Navigation
The Value of Interaction-Rich Data
Discussion
Localization with Long-term Anchors
Help-and-rescue module
Trajectory Chaining for Continual Learning
Network structures
...and 1 more sections

Figures (15)

Figure 1: SACSoN is a socially unobtrusive vision-based navigation policy for human-occupied spaces. We penalize counterfactual perturbations (gray) from the intended human trajectory (navy) and generate compliant commands (orange).
Figure 2: Our proposed objectives $J_\text{cp}$ and $J_\text{ps}$ for training the SACSoN policy. $J_\text{cp}$ penalizes the counterfactual perturbation from the pedestrian's estimated intended trajectory (left). $J_\text{ps}$ penalizes future violations of the pedestrian's personal space (right).
Figure 3: Pedestrian detection and tracking. We use a combination of YOLO and DeepSORT to detect and track pedestrians from visual observations, and estimate their relative position using the scaled depth estimates from ExAug's perception module.
Figure 4: HuRoN System overview. We design our autonomous data collection platform around a vision-based navigation system (gray) that uses a topological graph and a learned control policy. Our proposed system has three key components: a help-and-rescue module for collision recovery (orange), long-term anchors for localization (blue), and continual learning (yellow).
Figure 5: Data collection platform. We collect spherical and fisheye RGB images, 2D LiDAR, global odometry (using long-term visual anchors), and bumper signals using our robotic system.
...and 10 more figures
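
The personal-space term described in the Figure 2 caption could be sketched as a hinge penalty on the predicted robot-pedestrian distance. This is an illustrative assumption rather than the paper's definition of $J_\text{ps}$: the function name `personal_space_penalty`, the squared-hinge form, and the 1.0 m default radius are all hypothetical.

```python
import numpy as np


def personal_space_penalty(robot_traj, pedestrian_traj, radius=1.0):
    """Squared-hinge penalty for entering a pedestrian's personal space.

    Hypothetical helper for illustration only; the radius value and the
    hinge form are assumptions, not taken from the paper.

    robot_traj, pedestrian_traj: (T, 2) future positions in the same frame.
    radius: personal-space radius in meters (assumed value).
    """
    robot = np.asarray(robot_traj, dtype=float)
    pedestrian = np.asarray(pedestrian_traj, dtype=float)
    # Distance between robot and pedestrian at each future time step.
    dist = np.linalg.norm(robot - pedestrian, axis=-1)
    # Penalize only the steps where the robot is closer than the radius.
    violation = np.maximum(0.0, radius - dist)
    return float(np.mean(violation ** 2))


# The robot is 0.5 m from the pedestrian at the first step (a violation)
# and 3 m away at the second step (no violation).
robot = np.array([[0.0, 0.0], [1.0, 0.0]])
pedestrian = np.array([[0.0, 0.5], [1.0, 3.0]])
penalty = personal_space_penalty(robot, pedestrian)
```

Because the penalty is zero whenever the robot stays outside the radius, it only shapes behavior during close encounters and leaves open-space navigation unaffected.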
