Adap-RPF: Adaptive Trajectory Sampling for Robot Person Following in Dynamic Crowded Environments

Weixi Situ, Hanjing Ye, Jianwei Peng, Yu Zhan, Hong Zhang

TL;DR

Adap-RPF addresses robot person following in dynamic crowded environments with a hierarchical framework that densely samples candidate following points within the target's social zones and evaluates them with a prediction-aware, multi-objective cost to generate a proactive following trajectory. This trajectory is tracked by a prediction-aware MPPI controller that accounts for the predicted motions of surrounding pedestrians, enabling proactive collision avoidance. Key contributions include Sobol-based target-centric candidate sampling, a multi-objective evaluation framework with explicit occlusion and proxemic terms, and the integration of predicted pedestrian trajectories into the MPPI controller. Experiments on a public benchmark and real-world robot tests demonstrate improved target visibility, safety, and motion smoothness over state-of-the-art baselines across diverse dynamic scenarios.

Abstract

Robot person following (RPF) is a core capability in human-robot interaction, enabling robots to assist users in daily activities, collaborative work, and other service scenarios. However, achieving practical RPF remains challenging due to frequent occlusions, particularly in dynamic and crowded environments. Existing approaches often rely on fixed-point following or sparse candidate-point selection with oversimplified heuristics, which cannot adequately handle complex occlusions caused by moving obstacles such as pedestrians. To address these limitations, we propose an adaptive trajectory sampling method that generates dense candidate points within socially aware zones and evaluates them using a multi-objective cost function. Based on the optimal point, a person-following trajectory is estimated relative to the predicted motion of the target. We further design a prediction-aware model predictive path integral (MPPI) controller that simultaneously tracks this trajectory and proactively avoids collisions using predicted pedestrian motions. Extensive experiments show that our method outperforms state-of-the-art baselines in smoothness, safety, robustness, and human comfort, with its effectiveness further demonstrated on a mobile robot in real-world scenarios.
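
The dense sampling step can be illustrated with a short Python sketch that draws Sobol points and maps them into a target-centric semi-annular region, as the TL;DR and the Figure 3 caption describe. This is not the authors' implementation. The function names `sobol_2d` and `sample_candidates`, the radii `r_inner = 1.2` m and `r_outer = 3.6` m (commonly cited proxemic boundaries), and the choice of the half-ring behind the target's heading are all assumptions made for illustration. The sequence is the unscrambled two-dimensional Sobol sequence, written with the standard library only.

```python
import math

def sobol_2d(n):
    """First n points of the unscrambled 2-D Sobol sequence in [0, 1)^2."""
    bits = 32
    # Dimension 1: van der Corput direction numbers.
    v1 = [1 << (bits - 1 - i) for i in range(bits)]
    # Dimension 2: primitive polynomial x + 1, so v[i] = v[i-1] ^ (v[i-1] >> 1).
    v2 = [1 << (bits - 1)]
    for _ in range(1, bits):
        v2.append(v2[-1] ^ (v2[-1] >> 1))
    x1 = x2 = 0
    points = []
    for k in range(n):
        points.append((x1 / 2**bits, x2 / 2**bits))
        # Gray-code update: use the index of the lowest zero bit of k.
        c = 0
        while (k >> c) & 1:
            c += 1
        x1 ^= v1[c]
        x2 ^= v2[c]
    return points

def sample_candidates(target_xy, heading, n=64, r_inner=1.2, r_outer=3.6):
    """Map Sobol points into the half-ring behind the target (assumed region)."""
    tx, ty = target_xy
    candidates = []
    for u, v in sobol_2d(n):
        # Square-root mapping keeps the samples uniform in area over the ring.
        r = math.sqrt(r_inner**2 + u * (r_outer**2 - r_inner**2))
        # Angles span heading + 90 deg to heading + 270 deg, i.e. the rear half.
        angle = heading + math.pi / 2 + v * math.pi
        candidates.append((tx + r * math.cos(angle), ty + r * math.sin(angle)))
    return candidates

candidates = sample_candidates(target_xy=(2.0, 1.0), heading=0.3, n=64)
```

A low-discrepancy sequence covers the region more evenly than independent uniform draws at the same sample count, which is the usual motivation for Sobol sampling when every candidate must be scored.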

Paper Structure

This paper contains 23 sections, 16 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Adap-RPF in a real-world dynamic crowded environment. The robot proactively avoids dynamic occlusions (the pedestrian outlined in red) by adaptively sampling trajectory based on predicted human motion (colored arrows). Dashed outlines indicate agent positions at time $T_0$, while solid outlines indicate their positions at time $T_1$. The golden star marks the selected optimal following point, and golden circles indicate the candidate points.
  • Figure 2: RPF system pipeline. We integrate the Adap-RPF framework into the system; the light yellow modules indicate our contributions. Our framework consists of three components: human trajectory prediction, adaptive trajectory sampling, and a prediction-aware MPPI controller. The remaining RPF system modules are adopted from prior work (ye2025rpf) and are not the focus of this paper. Overall, the proposed RPF system can locate, track, and follow a target person while proactively avoiding occlusions in dynamic, crowded environments.
  • Figure 3: Target-centric Adaptive Following Trajectory Sampling. (a) Following Trajectory Sampling. Candidate following points are generated using Sobol sampling within a target-centric semi-annular region defined by the target’s personal and social zones. The candidates are evaluated using a multi-objective cost function that accounts for target visibility, collision risk, social compliance, and smoothness. A following trajectory is then constructed relative to the predicted target trajectory using an offset ($d$, $\theta$) based on the selected optimal point. (b) Proximity Cost (Eq. \ref{eq:proximity}). The minimum distance $d_{1}$ to surrounding pedestrians is computed from predicted trajectories and social zones; candidates intruding into personal space (e.g., $\mathbf{C}_{t}^{3}$, $\mathbf{C}_{t}^{1}$) are discarded, and $\mathbf{C}_{t}^{2}$ is selected. (c) Occlusion Cost (Eq. \ref{eq:occlusion}). Occlusion is estimated via the IoU between pedestrian and target projections ($\mathbf{P}_{t+N}^{h}$ and $\mathbf{P}_{t+N}^\mathrm{Tar}$) using predicted trajectories; $\mathbf{C}_{t}^{1}$ is rejected due to partial occlusion (light gray), whereas $\mathbf{C}_{t}^{2}$ remains visible. (d) Distance Cost (Eq. \ref{eq:dis}). Penalizes deviation from the desired spacing to the target. (e) Travel Cost (Eq. \ref{eq:travel}). Encourages minimal path effort from the robot’s current position (green rectangle $\mathbf{P}_{t}^{R}$). (f) Stickiness Cost (Eq. \ref{eq:stick}). Promotes continuity by favoring points near the previously selected following point.
  • Figure 4: Representative dynamic-crowd scenarios in the public RPF benchmark: (a) circular crowds, (b) random crowds, (c) parallel, and (d) perpendicular. The LiDAR emits red beam lines, while the green dashed line represents the RPF trajectory. Blue circles indicate moving pedestrians, the green rectangle denotes the robot, and the yellow circle marks the target being followed.
  • Figure 5: Comparison of our method with optimization-based and multiple-predefined-point baselines across six metrics on the public benchmark. The horizontal axis represents the number of pedestrians ($5$–$30$). Each method is tested across four scenarios: circular crowds, random crowds, parallel, and perpendicular. Results are averaged over 20 random instances per scenario and then across all scenarios.
  • ...and 1 more figures
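
The candidate evaluation and offset-based trajectory construction described in the Figure 3 caption can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's cost function: the equations themselves are not reproduced on this page, so the linear cost terms, the weights `w`, the desired spacing `d_desired`, and the `personal_radius` threshold are hypothetical, and the occlusion (IoU) term is omitted. Pedestrians are represented by a single predicted position each rather than a full predicted trajectory.

```python
import math

def score(c, target, robot, prev_best, pedestrians,
          d_desired=1.5, personal_radius=0.45, w=(1.0, 0.5, 0.5)):
    """Weighted cost of one candidate; math.inf marks a discarded candidate."""
    # Proximity: discard candidates inside any pedestrian's personal space.
    if any(math.dist(c, p) < personal_radius for p in pedestrians):
        return math.inf
    distance = abs(math.dist(c, target) - d_desired)  # deviation from desired spacing
    travel = math.dist(c, robot)                      # effort from the robot's position
    stick = math.dist(c, prev_best)                   # continuity with the last choice
    return w[0] * distance + w[1] * travel + w[2] * stick

def following_trajectory(best, target, heading, predicted_target):
    """Express `best` as an offset (d, theta) in the target frame and
    replay that offset along the predicted target poses (x, y, heading)."""
    d = math.dist(best, target)
    theta = math.atan2(best[1] - target[1], best[0] - target[0]) - heading
    return [(x + d * math.cos(h + theta), y + d * math.sin(h + theta))
            for x, y, h in predicted_target]

# Hypothetical scene: target at the origin heading along +x.
target, heading = (0.0, 0.0), 0.0
robot, prev_best = (-2.0, 0.0), (-1.5, 0.0)
pedestrians = [(-1.0, 1.1)]
candidates = [(-1.5, 0.0), (-1.0, 1.0), (-2.0, -0.5)]

best = min(candidates,
           key=lambda c: score(c, target, robot, prev_best, pedestrians))
trajectory = following_trajectory(
    best, target, heading, [(1.0, 0.0, 0.0), (2.0, 0.0, math.pi / 2)])
```

Because the offset is stored relative to the target's heading, the following point rotates with the target: when the predicted heading turns by 90 degrees, a point directly behind the target stays directly behind it.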