Boundary-Aware Value Function Generation for Safe Stochastic Motion Planning

Junhong Xu; Kai Yin; Jason M. Gregory; Kris Hauser; Lantao Liu

Boundary-Aware Value Function Generation for Safe Stochastic Motion Planning

Junhong Xu, Kai Yin, Jason M. Gregory, Kris Hauser, Lantao Liu

TL;DR

This work proposes a principled boundary-aware safe stochastic planning framework that generates a value function that can strictly distinguish the state values between free (safe) and non-navigable (boundary) spaces in the continuous state, naturally leading to a safe boundary-aware policy.

Abstract

Navigation safety is critical for many autonomous systems such as self-driving vehicles in an urban environment. It requires an explicit consideration of boundary constraints that describe the borders of any infeasible, non-navigable, or unsafe regions. We propose a principled boundary-aware safe stochastic planning framework with promising results. Our method generates a value function that can strictly distinguish the state values between free (safe) and non-navigable (boundary) spaces in the continuous state, naturally leading to a safe boundary-aware policy. At the core of our solution lies a seamless integration of finite elements and kernel-based functions, where the finite elements allow us to characterize safety-critical states' borders accurately, and the kernel-based function speeds up computation for the non-safety-critical states. The proposed method was evaluated through extensive simulations and demonstrated safe navigation behaviors in mobile navigation tasks. Additionally, we demonstrate that our approach can maneuver safely and efficiently in cluttered real-world environments using a ground vehicle with strong external disturbances, such as navigating on a slippery floor and against external human intervention.

Boundary-Aware Value Function Generation for Safe Stochastic Motion Planning

TL;DR

Abstract

Paper Structure (33 sections, 49 equations, 6 figures, 2 tables, 1 algorithm)

This paper contains 33 sections, 49 equations, 6 figures, 2 tables, 1 algorithm.

Introduction
Related Work
Deterministic and Stochastic Motion Planning
Markov Decision Process and Value Function Approximation
Safety Guarantee with Hamilton-Jacobi Reachability Analysis
Preliminaries
Markov Decision Processes
Necessity of Continuous State MDP Formulation in Motion Planning Tasks
Diffusion MDP (DMDP) Approximates MDP in the Limit
Boundary Conditions of DMDP
Methods
Finite Element Methods for Policy Evaluation
Tensor product of kernel and polynomial basis functions
Policy Update
Policy Iteration Algorithm
...and 18 more sections

Figures (6)

Figure 1: The average steps to the goal measure the efficiency of the methods. For the runs that do not reach the goal, including collisions or getting stuck, we set the number of steps as 700, which is the maximum step of the simulation.
Figure 2: The multi-trial trajectories of (a) our method (2D Mesh + 1D Kernel) and (c) kernelized value function representation in one $0.25$ obstacle ratio environment demonstrate the ability of our method to navigate in confined spaces despite the motion noise. The red and green squares are the obstacles and goals, respectively. The black arrows show $50$ trajectories (state-action sequences), and the color gradients around the trajectories visualize the estimated probability density of the robot's trajectory. It illustrates the likelihood of the robot visiting each state, given its policy and the stochastic dynamics model. Figures (b) and (d) compare value functions in log scale for the $\mathrm{x}-\mathrm{y}$ dimensions at $\theta=\frac{\pi}{2}$ for the two methods. (Please note that the viewing angles of the value function surfaces have been optimized to enhance visualization. In both figures, the upper corners displaying the highest values correspond to the goal regions.)
Figure 3: The performance and computation time comparison between the varying mesh dimensions in figures (a) and (b), respectively. In (a), the $\mathrm{x}$ axis shows each element's total number of integration points. In (b), the $\mathrm{x}$ axis shows the number of integration points at one mesh-based dimension. Thus, the total number of integration points in one element is $n^d$, where $n$ is the mesh dimensions, and $d$ is the number of integration points. We use the same kernel function as the previous section for the meshless dimensions. All the methods are computed until convergence and tested in $0.25$ ratio obstacle environments.
Figure 4: Each row shows the navigation result of one obstacle configuration. The columns from left to right represent the environment, its corresponding obstacle map and multi-trial trajectories, our method's value function, and the kernelized value function, respectively. In all the figures, the green squares denote the goal locations. The first column shows snapshots of obstacle configurations where the robot is placed at the starting position; The robot starting positions are highlighted as red stars in both the obstacle maps and the visualizations of the value function in the subsequent columns. (Note, the viewing perspective in the first column differs from other columns due to spatial constraints that restricted our options for capturing angles.) The value functions in the last two columns are shown in log scale for the $(\mathrm{x},\mathrm{y})$ dimensions at $\theta = \frac{\pi}{2}$. The grey squares in the second column represent the obstacles and the goal, and the arrows represent the trajectories of $4$ runs. We ignore the trajectories of the kernelized value function since none of the runs arrived at the goal. The noisy trajectory distributions are due to navigating on the smooth floor at a relatively high speed. This also implies that since this drift effect is hard to model, it is essential to consider uncertainty during planning.
Figure 5: (a-e) Snapshots of navigation under external disturbances. The designated goal is shown in the green circular region. In the early stages of navigation, depicted in (a) and (b), the robot moves towards the goal. However, the occurrence of external disturbances as a result of human interventions, as indicated by red arrows in (b) and (c), led to a significant deviation from the intended trajectory, resulting in a $180$ degree rotation of the robot's heading angle, represented by a blue arrow. Despite these hindrances, the robot demonstrates its adaptive capabilities by efficiently readjusting its trajectory and successfully reaching the designated goal. (f) A snapshot of visualization of the constructed obstacles and the robot trajectory.
...and 1 more figures

Boundary-Aware Value Function Generation for Safe Stochastic Motion Planning

TL;DR

Abstract

Boundary-Aware Value Function Generation for Safe Stochastic Motion Planning

Authors

TL;DR

Abstract

Table of Contents

Figures (6)