Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning based Local Motion Planners

Wen Zheng Terence Ng, Jianda Chen, Sinno Jialin Pan, Tianwei Zhang

TL;DR

This work introduces an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. It also proposes diverse scenarios, inspired by pedestrian crowd behaviors, for assessing an agent’s robustness against unseen crowds.

Abstract

Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and can overfit to that policy. Alternatively, framing collision avoidance as a multi-agent problem, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians because the agents behave homogeneously. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. To assess an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
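
The abstract does not spell out the information-theoretic objective. As a rough illustration only, the sketch below shows one common way such diversity objectives are turned into an intrinsic reward: a discriminator predicts which behavior code produced a state, and the agent is rewarded when its behavior is identifiable. The function names, the uniform prior over behaviors, and the reward form `log q(z|s) - log p(z)` are assumptions borrowed from generic skill-discovery methods, not the paper's actual formulation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def diversity_reward(logits, z):
    """Hypothetical intrinsic reward: log q(z|s) - log p(z).

    logits: discriminator outputs for one state, one entry per behavior code.
    z:      index of the behavior code the agent was conditioned on.
    Assumes a uniform prior p(z) over the behavior codes.
    """
    num_behaviors = len(logits)
    log_q_z = log_softmax(logits)[z]      # how identifiable the behavior is
    log_p_z = -math.log(num_behaviors)    # uniform prior (assumption)
    return log_q_z - log_p_z


if __name__ == "__main__":
    # An uninformative discriminator gives zero reward.
    print(diversity_reward([0.0, 0.0, 0.0, 0.0], z=2))
    # A confident, correct discriminator gives a positive reward.
    print(diversity_reward([10.0, 0.0, 0.0, 0.0], z=0))
```

Under these assumptions the reward is zero when the discriminator cannot tell behaviors apart, approaches `log K` for `K` behavior codes when the behavior is clearly identifiable, and becomes negative when the state looks like it came from a different behavior.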

Paper Structure

This paper contains 18 sections, 5 equations, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: A human may take diverse strategies to reach the same predefined goal (left). We propose a behavior-conditioned policy to integrate such diversity into the robot agent (right). This diversity enriches the agent with a more varied range of experiences when learning in a multi-agent framework, and improves its ability to generalize in unseen crowd behaviors.
  • Figure 2: Our framework for behavior-conditioned policy. An intrinsic reward computed from the discriminators $q_\psi$ and $q_\phi$ encourages diversity by indirectly maximizing the variational lower bound $\mathbb{G}(\theta)$.
  • Figure 3: The discriminator loss and reward curves.
  • Figure 4: Agents' varied paths based on sampled behaviors.
  • Figure 5: Testing our method in Gazebo with more realistic scenarios. Map settings: (Left) Warehouse, (Right) Hospital.