Learning Dynamic Weight Adjustment for Spatial-Temporal Trajectory Planning in Crowd Navigation
Muqing Cao, Xinhang Xu, Yizhuo Yang, Jianping Li, Tongxing Jin, Pengfei Wang, Tzu-Yi Hung, Guosheng Lin, Lihua Xie
TL;DR
The paper tackles crowd navigation by learning to adapt the objective weights of a spatial-temporal trajectory planner. A neural policy, trained with PPO in a POMDP formulation, maps sensor observations to a weight vector $\bm{w}=(w_T,w_f,w_{\dot{\theta}},w_s,w_h)$ for the planner, using an observation encoding that includes static and dynamic maps. The planner represents trajectories as 5th-order polynomials and optimizes a multi-objective cost with soft feasibility penalties, balancing safety, efficiency, and goal attainment. Simulation results show improved safety over fixed-weight planning and state-of-the-art learning-based baselines, and show that the learned policy adapts the weights to the observed situation. A real-world run with an autonomous delivery robot through a crowded 300 m corridor demonstrates feasibility, which is relevant to delivery and service robots operating in human crowds.
Abstract
Robot navigation in dense human crowds poses a significant challenge due to the complexity of human behavior in dynamic and obstacle-rich environments. In this work, we propose a dynamic weight adjustment scheme that uses a neural network to predict the optimal weights of the objectives in an optimization-based motion planner. We adopt a spatial-temporal trajectory planner and incorporate diverse objectives to balance safety, efficiency, and goal achievement in complex and dynamic environments. We design the network structure, observation encoding, and reward function to effectively train the policy network using reinforcement learning, allowing the robot to adapt its behavior in real time based on environmental and pedestrian information. Simulation results show improved safety compared to a fixed-weight planner and state-of-the-art learning-based methods, and verify that the learned policy adapts the weights to the observed situation. The approach's feasibility is demonstrated in a navigation task using an autonomous delivery robot across a crowded corridor over a 300 m distance.
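To make the weight-adjustment idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the planner scalarizes its objectives as a plain weighted sum and that trajectories are 5th-order polynomials in time. The term names, the candidate costs, and the two weight settings are all hypothetical placeholders; in the paper the weights come from the learned policy network and the costs come from the trajectory optimizer.

```python
# Hypothetical keys mirroring the weight subscripts (T, f, theta_dot, s, h).
# What each objective term computes is NOT specified here; values are made up.
WEIGHT_NAMES = ("T", "f", "theta_dot", "s", "h")


def quintic(coeffs, t):
    """Evaluate c0 + c1*t + ... + c5*t**5 using Horner's rule."""
    if len(coeffs) != 6:
        raise ValueError("a 5th-order polynomial needs 6 coefficients")
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value


def weighted_cost(weights, costs):
    """Scalarize per-objective costs as a weighted sum (an assumed form)."""
    return sum(weights[name] * costs[name] for name in WEIGHT_NAMES)


def select(weights, candidates):
    """Return the name of the candidate with the lowest weighted cost."""
    return min(candidates, key=lambda name: weighted_cost(weights, candidates[name]))


# Two made-up candidate trajectories, summarized by per-objective costs.
candidates = {
    "fast": {"T": 1.0, "f": 0.5, "theta_dot": 0.2, "s": 2.0, "h": 1.0},
    "cautious": {"T": 3.0, "f": 0.5, "theta_dot": 0.2, "s": 0.1, "h": 0.2},
}

# Stand-ins for policy outputs in two different situations (hypothetical).
weights_open_space = {"T": 2.0, "f": 1.0, "theta_dot": 1.0, "s": 1.0, "h": 1.0}
weights_dense_crowd = {"T": 1.0, "f": 1.0, "theta_dot": 1.0, "s": 5.0, "h": 5.0}

choice_open = select(weights_open_space, candidates)
choice_crowd = select(weights_dense_crowd, candidates)
```

With these placeholder numbers, the same two candidates are ranked differently under the two weight settings. This is the behavior a fixed-weight planner cannot reproduce, and it is why predicting the weights from observations can change the planner's output without changing the planner itself.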
