Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning

Martin Moder; Stephen Adhisaputra; Josef Pauli

TL;DR

This work tackles robust robot navigation in crowded environments by combining goal-conditioned generative models trained on human crowd data with Sampling-based Model Predictive Control (SMPC). It introduces goal-conditioned Neural Autoregressive (NAR) and Neural Inverse Autoregressive (NIAR) models to forecast human actions, and integrates these forecasts into a Model Predictive Path Integral (MPPI) planner that respects robot dynamics via a Dynamic Window Constraint. A reward with four components (a map-based cost, collision penalties, a human-imitation likelihood, and a Social Influence Reward) guides planning toward trajectories that are safe, human-like, and efficient. An Adaptive Sub-goal Navigation strategy ties local planning to a global map to avoid local minima. Real-world LoCoBot demonstrations and extensive evaluations on the ETH, UCY, and Wildtrack datasets show that the MPPI-NAR/NIAR approach yields higher success rates and lower collision rates than baselines, including DWA and offline RL methods. The evaluation also highlights practical considerations such as data requirements and platform variability. The results underscore the value of integrating goal-conditioned imitation with SMPC for real-time, socially aware navigation in dynamic environments, and point to richer datasets and broader robotic platforms as directions for future work.

Abstract

This paper addresses navigation in crowded environments by integrating goal-conditioned generative models with Sampling-based Model Predictive Control (SMPC). We introduce goal-conditioned autoregressive models to generate crowd behaviors, capturing intricate interactions among individuals. The model processes potential robot trajectory samples and predicts the reactions of surrounding individuals, enabling proactive robotic navigation in complex scenarios. Extensive experiments show that this algorithm enables real-time navigation, significantly reducing collision rates and path lengths, and outperforming selected baseline methods. The practical effectiveness of this algorithm is validated on an actual robotic platform, demonstrating its capability in dynamic settings.
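The sampling-based planning step summarized above can be illustrated with a minimal, standard-library-only sketch of one MPPI iteration for a unicycle robot. It is not the authors' implementation. The names `toy_reward` and `mppi_update`, all parameter values except the $dt=0.4$ s step, and the exact form of the dynamic window (velocity bounds plus a per-step change limit) are assumptions made for illustration. The human forecast here is a fixed list of positions, whereas the paper's NAR/NIAR models predict reactions conditioned on each robot trajectory sample.

```python
import math
import random


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def rollout(state, controls, dt):
    """Integrate unicycle dynamics from state = (x, y, heading)."""
    x, y, th = state
    path = []
    for v, w in controls:
        th += w * dt
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
        path.append((x, y))
    return path


def toy_reward(path, goal, human_forecast, safe_dist=0.5):
    """Hypothetical reward: final goal distance plus a per-step collision penalty."""
    total = -math.hypot(path[-1][0] - goal[0], path[-1][1] - goal[1])
    for (x, y), humans in zip(path, human_forecast):
        for hx, hy in humans:
            if math.hypot(x - hx, y - hy) < safe_dist:
                total -= 10.0
    return total


def mppi_update(state, mean, goal, human_forecast, rng, n_samples=256,
                sigma=(0.2, 0.5), lam=0.5, dt=0.4, v_lim=(0.0, 0.7),
                w_lim=(-1.0, 1.0), max_dv=0.2, max_dw=0.6, prev_cmd=(0.0, 0.0)):
    """One MPPI iteration: sample, clamp to the window, score, re-weight."""
    samples, rewards = [], []
    for _ in range(n_samples):
        v_prev, w_prev = prev_cmd
        seq = []
        for v_mean, w_mean in mean:
            # Perturb the nominal command, then clamp it to the velocities
            # reachable from the previous command (assumed window form).
            v = clamp(v_mean + rng.gauss(0.0, sigma[0]),
                      max(v_lim[0], v_prev - max_dv), min(v_lim[1], v_prev + max_dv))
            w = clamp(w_mean + rng.gauss(0.0, sigma[1]),
                      max(w_lim[0], w_prev - max_dw), min(w_lim[1], w_prev + max_dw))
            seq.append((v, w))
            v_prev, w_prev = v, w
        samples.append(seq)
        rewards.append(toy_reward(rollout(state, seq, dt), goal, human_forecast))
    # Exponential weights over rewards; subtracting the max keeps exp() stable.
    best = max(rewards)
    weights = [math.exp((r - best) / lam) for r in rewards]
    norm = sum(weights)
    return [
        (sum(wt * s[t][0] for wt, s in zip(weights, samples)) / norm,
         sum(wt * s[t][1] for wt, s in zip(weights, samples)) / norm)
        for t in range(len(mean))
    ]


rng = random.Random(0)
horizon = 8
forecast = [[(1.0, 0.0)] for _ in range(horizon)]  # one static "human"
plan = mppi_update((0.0, 0.0, 0.0), [(0.0, 0.0)] * horizon, (3.0, 0.0), forecast, rng)
```

In a receding-horizon loop, only the first command of `plan` would be executed before re-planning from the new state. Because the velocity bounds and the per-step change limits are linear inequalities, the weighted average of clamped samples satisfies them as well, so the updated plan needs no second clamping pass.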

Paper Structure (23 sections, 17 equations, 13 figures, 1 table, 5 algorithms)

Introduction
Related Work
Learning a Policy
Planning with Generative Models
Robot Navigation as a Multiplayer Game
Methods
Sampling-Based Model Predictive Control
Model Predictive Path Integral
Neural Autoregressive Model
Neural Inverse Autoregressive Model
Reward Function
The Reward Map
Human Policy based Reward Signals
Human Goal Optimization
Adaptive Sub-goal Navigation
...and 8 more sections

Figures (13)

Figure 1: Our approach to imitation learning and planning for robotic navigation in crowded settings is model-based. (a) The dataset comprises recordings of crowd dynamics. (b) Using this dataset, a generative model is trained to forecast the future positions of individuals. (c) The robot, equipped with a 3D camera and a 2D LiDAR sensor, detects and tracks pedestrian positions and generates a cost map to avoid obstacles. On the left side of the images, four distinct trajectories are shown: agents' past paths (red), predicted future paths (orange), the robot's planned trajectory (green), and the robot's global plan (thin red line). Cylinders represent the positions and outlines of humans. (d) The model predictive control framework, enhanced by the generative model, proactively plans robot trajectories that mimic human movements. (e) New observations can be added to the dataset, allowing the approach to scale with more data. This illustration is adapted from moder2023MIL.
Figure 2: A snapshot from the second sequence in the ETH pedestrian dataset pellegrini2010improving, showcasing a crowded street scenario.
Figure 3: MPPI planning visualization for a two-wheeled robot (LoCoBot) locobotInterbotix in the metric state space, with $dt=0.4$ s intervals. Columns show consecutive timesteps from two scenes, each depicted row-wise. Triangles indicate observed agent states. MPPI sample populations are green, with the mean best trajectory in black squares. The first state from this trajectory is executed. Human trajectory forecasts using the NAR model are shown as orange dots.
Figure 4: A snapshot from RViz shows humans as cylinders, the LoCoBot robot, and the environment as an occupancy grid map. Observed human states are in red, with the most likely predictions in orange. Human goals are chosen using Algorithm \ref{alg:hgo}.
Figure 5: The robot (in black) navigates to its target while avoiding simulated humans (in various colors). Filled circles indicate current positions, and unfilled circles indicate past positions. The current position of the selected human, which is invisible to the robot, is marked with a star.
...and 8 more figures
