Path planning with moving obstacles using stochastic optimal control
Seyyed Reza Jafari, Anders Hansson, Bo Wahlberg
TL;DR
This work tackles autonomous path planning in the presence of moving obstacles by casting the problem as an infinite-horizon stochastic optimal control problem of stochastic shortest path (SSP) type and addressing the curse of dimensionality via geometric symmetry. The authors derive a reduced three-dimensional state representation (e, d, theta) through SO(2) symmetry and solve the reduced Bellman equation with fitted value iteration, then use online rollout to implement a receding-horizon controller. They introduce a two-term cost that balances fast target-reaching against safe separation from the obstacle, with the trade-off controlled by a weight lambda, and report that the method outperforms control barrier function and receding-horizon A* approaches on average. The framework is validated numerically with a large discretization and a sample-based value function approximation, and the reported results show improvements in both planning efficiency and safety in dynamic environments. Potential extensions include three-dimensional environments, moving targets, and multi-obstacle scenarios, as well as integration with learning-based methods in unknown settings.
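For orientation, a stochastic shortest path Bellman equation with a two-term stage cost has the generic shape sketched below. The symbols are illustrative assumptions and not necessarily the paper's notation: x is the (reduced) state, u the control, w the random disturbance that drives the obstacle, f the dynamics, T the target set, and g_time and g_safe are hypothetical names for the target-reaching and separation terms. How lambda enters the cost is likewise an assumption.

```latex
\begin{align}
  J^*(x) &= \min_{u \in U} \; \mathbb{E}_{w}\!\left[\, g(x,u) + J^*\big(f(x,u,w)\big) \right],
  \qquad J^*(x) = 0 \ \text{for } x \in T, \\
  g(x,u) &= g_{\text{time}}(x,u) + \lambda \, g_{\text{safe}}(x), \qquad \lambda \ge 0 .
\end{align}
```

Under this reading, a larger lambda puts more weight on keeping distance from the obstacle, while lambda = 0 reduces the problem to reaching the target as fast as possible.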
Abstract
Finding a collision-free, optimal path for a robot is a persistent challenge, particularly in the presence of moving objects such as humans. In this study, we formulate the path-planning problem as a stochastic optimal control problem. Because solving this problem directly is nontrivial, we first consider a simplified, more tractable problem, for which we are able to solve the corresponding Bellman equation. The solution to the simplified problem, however, does not sufficiently address the original problem of interest. To address the full problem, we propose a numerical procedure in which we solve an optimization problem at each sampling instant, with the solution to the simplified problem entering the online formulation as a final-state penalty. We illustrate the efficiency of the proposed method using a numerical example.
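The two-stage idea in the abstract (solve a simplified problem offline, then reuse its value function as a final-state penalty in a small optimization solved at every sampling instant) can be illustrated on a toy grid world. The sketch below is not the authors' formulation: the grid, the random-walk obstacle, the unit step cost, the proximity-based risk term, the weight `LAM`, and all function names are assumptions chosen for illustration. Here the simplified problem ignores the obstacle entirely, and the online problem enumerates short open-loop move sequences.

```python
import itertools
import numpy as np

N = 7                                   # grid side (hypothetical toy world)
TARGET = (6, 6)                         # target cell (assumed)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus "wait"
LAM = 5.0                               # weight on the safety term (assumed)
HORIZON = 3                             # lookahead of the online problem


def step(cell, move):
    """Apply a move and clip the result to the grid."""
    r = min(max(cell[0] + move[0], 0), N - 1)
    c = min(max(cell[1] + move[1], 0), N - 1)
    return (r, c)


def solve_simplified():
    """Value iteration for the obstacle-free problem with unit cost per step."""
    J = np.full((N, N), 1e3)
    J[TARGET] = 0.0
    for _ in range(4 * N):              # more sweeps than this grid needs
        for r in range(N):
            for c in range(N):
                if (r, c) != TARGET:
                    J[r, c] = 1.0 + min(J[step((r, c), m)] for m in MOVES)
    return J


def propagate(p):
    """Push the obstacle distribution one step: uniform random choice of MOVES."""
    q = np.zeros_like(p)
    for r in range(N):
        for c in range(N):
            if p[r, c] > 0:
                for m in MOVES:
                    q[step((r, c), m)] += p[r, c] / len(MOVES)
    return q


def risk(cell, p):
    """Probability that the obstacle is within one cell of `cell`."""
    r, c = cell
    return p[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2].sum()


def plan(robot, obstacle, J):
    """Return the first move of the cheapest open-loop sequence over HORIZON."""
    p = np.zeros((N, N))
    p[obstacle] = 1.0
    beliefs = []
    for _ in range(HORIZON):
        p = propagate(p)
        beliefs.append(p)
    best_cost, best_move = np.inf, (0, 0)
    for seq in itertools.product(MOVES, repeat=HORIZON):
        x, cost = robot, 0.0
        for m, b in zip(seq, beliefs):
            if x == TARGET:             # target is absorbing and cost-free
                break
            x = step(x, m)
            cost += 1.0 + LAM * risk(x, b)   # time term + weighted safety term
        cost += J[x]                    # final-state penalty from offline solve
        if cost < best_cost:
            best_cost, best_move = cost, seq[0]
    return best_move


def simulate(robot=(0, 0), obstacle=(3, 3), seed=0, max_steps=200):
    """Closed loop: re-plan at every step while the obstacle moves at random."""
    rng = np.random.default_rng(seed)
    J = solve_simplified()
    path = [robot]
    while robot != TARGET and len(path) <= max_steps:
        robot = step(robot, plan(robot, obstacle, J))
        obstacle = step(obstacle, MOVES[rng.integers(len(MOVES))])
        path.append(robot)
    return path
```

Calling `simulate()` returns the list of cells visited by the robot. Because the robot's trajectory under an open-loop sequence is deterministic, the expected stage cost only requires the propagated obstacle distribution, which keeps the online problem cheap. The final-state penalty `J[x]` is what lets a three-step lookahead still steer toward a distant target.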
