Decentralized Structural-RNN for Robot Crowd Navigation with Deep Reinforcement Learning

Shuijing Liu, Peixin Chang, Weihang Liang, Neeloy Chakraborty, Katherine Driggs-Campbell

TL;DR

DS-RNN tackles decentralized robot crowd navigation under partial observability by modeling interactions as a decentralized spatio-temporal graph and learning end-to-end with model-free reinforcement learning. It decomposes decision making into spatial-edge, temporal-edge, and node factors via RNNs and an attention module, trained with PPO to maximize $V(s)$, where $V(s)=\mathbb{E}[R_t|s_t=s]$ and $R_t=\sum_{k=0}^{\infty} \gamma^{k} r_{t+k}$. In simulation and real-world TurtleBot deployments, DS-RNN outperforms reaction-based methods and prior learning-based approaches in dense crowds and under partial observability, and the learned policy transfers to real hardware. The work contributes (1) the DS-RNN architecture, (2) end-to-end model-free RL training without expert supervision, and (3) demonstrated improvements in challenging navigation scenarios. Future work plans to incorporate mutual robot-human interactions and raw camera inputs.
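As a rough illustration of the return $R_t=\sum_{k=0}^{\infty} \gamma^{k} r_{t+k}$ that the value function averages over, the sketch below computes discounted returns for a finite episode. It is not the authors' training code: the function name `discounted_returns`, the default `gamma`, and the truncation of the infinite sum at the end of the episode are assumptions made for this example.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Return R_t = sum_k gamma^k * r_{t+k} for every timestep t of one episode.

    Assumption: the episode is finite, so the infinite sum is truncated
    at the last recorded reward.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the following step:
    # R_t = r_t + gamma * R_{t+1}
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Hypothetical reward sequence, for illustration only.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

The backward recursion keeps the computation linear in episode length. An empirical estimate of $V(s)$ would average such returns over many episodes that visit state $s$.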

Abstract

Safe and efficient navigation through human crowds is an essential capability for mobile robots. Previous work on robot crowd navigation assumes that the dynamics of all agents are known and well-defined. In addition, the performance of previous methods deteriorates in partially observable environments and environments with dense crowds. To tackle these problems, we propose decentralized structural-Recurrent Neural Network (DS-RNN), a novel network that reasons about spatial and temporal relationships for robot decision making in crowd navigation. We train our network with model-free deep reinforcement learning without any expert supervision. We demonstrate that our model outperforms previous methods in challenging crowd navigation scenarios. We successfully transfer the policy learned in the simulator to a real-world TurtleBot 2i. For more information, please visit the project website at https://sites.google.com/view/crowdnav-ds-rnn/home.

Paper Structure

This paper contains 25 sections, 9 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Real-world crowd navigation with a TurtleBot 2i. The orange cone on the floor denotes the robot goal. The TurtleBot is equipped with cameras for localization and human tracking.
  • Figure 2: Conversion from the st-graph to the factor graph. (a) St-graph representation of the crowd navigation scenario. We use $\mathrm{w}$ to denote the robot node and $\mathrm{u}_i$ to denote the $i$-th human node. (b) Unrolled st-graph for two timesteps. At timestep $t$, the node feature for the robot is $x^t_w$. The spatial edge feature between the $i$-th human and the robot is $x_{u_i w}^t$. The temporal edge feature for the robot is $x_{ww}^{t}$. (c) The corresponding factor graph. Factors are denoted by black boxes.
  • Figure 3: DS-RNN network architecture. The components for processing spatial edge features, temporal edge features, and node features are in blue, green, and yellow, respectively. Fully connected layers are denoted as $FC$.
  • Figure 4: Illustration of our simulation environment. In a $12\,\mathrm{m}\times12\,\mathrm{m}$ 2D plane, the humans are represented as circles, the orientation of an agent is indicated by a red arrow, the robot is the yellow disk, and the robot's goal is the red star. We outline the borders of the robot FoV with dashed lines. The humans in the robot's FoV are blue and the humans outside are red.
  • Figure 5: Success, timeout, and collision rates w.r.t. different FoV. The number on each bar is the percentage that bar represents.
  • ...and 2 more figures