Collision-Free Robot Navigation in Crowded Environments using Learning based Convex Model Predictive Control

Zhuanglei Wen, Mingze Dong, Xiai Chen

TL;DR

This paper uniquely bridges the robot’s perception, decision-making and control processes by utilizing the convex obstacle-free region computed from 2D LiDAR data.

Abstract

Navigating robots safely and efficiently in crowded and complex environments remains a significant challenge. The dynamic and intricate nature of these settings makes it particularly difficult to plan efficient, collision-free paths for robots to track. In this paper, we uniquely bridge the robot's perception, decision-making and control processes by utilizing the convex obstacle-free region computed from 2D LiDAR data. The overall pipeline is threefold: (1) We propose a robot navigation framework that utilizes deep reinforcement learning (DRL), conceptualizing the observation as the convex obstacle-free region, a departure from the general reliance on raw sensor inputs. (2) We design the action space, derived from the intersection of the robot's kinematic limits and the convex region, to enable efficient sampling of inherently collision-free reference points. These actions assist in guiding the robot towards the goal and in interacting with other obstacles during navigation. (3) We employ model predictive control (MPC) to track the trajectory formed by the reference points while satisfying the constraints imposed by the convex obstacle-free region and the robot's kinodynamic limits. The effectiveness of the proposed improvements is validated through two sets of ablation studies and a comparative experiment against the Timed Elastic Band (TEB), demonstrating improved navigation performance in crowded and complex environments.
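
As an illustration of step (2), the following is a minimal sketch of how a normalized action could be mapped to a reference point that lies inside both a convex obstacle-free region and a kinematic reach limit. It is not the authors' implementation: the half-plane representation of the region, the disk-shaped reach limit `max_reach`, the polar action parameterization, and the function names `ray_limit` and `action_to_reference_point` are all assumptions made for this example. The robot is assumed to sit at the origin, inside the region.

```python
import math

def ray_limit(halfplanes, theta, max_reach):
    """Longest step from the origin along direction theta that stays inside
    every half-plane ax*x + ay*y <= b and within the reach radius.
    Assumes the origin is inside the region (every b >= 0)."""
    dx, dy = math.cos(theta), math.sin(theta)
    limit = max_reach
    for ax, ay, b in halfplanes:
        rate = ax * dx + ay * dy  # how fast the ray approaches this boundary
        if rate > 1e-12:          # only boundaries the ray moves towards can bind
            limit = min(limit, b / rate)
    return max(limit, 0.0)

def action_to_reference_point(halfplanes, action, max_reach):
    """Map a normalized action (u_theta, u_r), both in [0, 1], to a point
    inside the intersection of the convex region and the reach disk."""
    u_theta, u_r = action
    theta = 2.0 * math.pi * u_theta - math.pi  # direction in [-pi, pi]
    r = u_r * ray_limit(halfplanes, theta, max_reach)
    return (r * math.cos(theta), r * math.sin(theta))

# Hypothetical region: the box -2 <= x <= 1, -1 <= y <= 1 written as half-planes.
region = [(1.0, 0.0, 1.0), (-1.0, 0.0, 2.0), (0.0, 1.0, 1.0), (0.0, -1.0, 1.0)]
point = action_to_reference_point(region, (0.5, 1.0), max_reach=3.0)
```

Because the radial component is scaled by the distance to the nearest binding boundary, every action in the unit square yields a point inside the feasible set, so no sample has to be rejected. This is one way to obtain the "inherently collision-free" property described above, under the stated assumptions.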

Paper Structure (19 sections, 18 equations, 7 figures, 4 tables)


Figures (7)

  • Figure 1: Proposed Navigation Architecture: The convex obstacle-free region is obtained from 2D LiDAR point cloud data. The policy network selects reference points ($Q_{t}^{s}$ and $Q_{t}^{l}$) from this convex region based on consecutive frames of observations. The MPC is then employed with online optimization to compute an optimal local trajectory. The trajectory follows the reference points closely while satisfying both the kinodynamic constraints and the convex obstacle-free region requirements. This iterative process continues until the robot reaches its goal.
  • Figure 2: Schematic of Short-Term and Long-Term Reference Point Formulation
  • Figure 3: Schematic of Iterative Trajectory Optimization within the Convex Region
  • Figure 4: Schematic of the Observation Vector
  • Figure 5: Network Architecture: The value network $V_{\phi}$ and policy network $\pi_{\theta}$ are structured as four-layer fully connected networks. The policy network is augmented with a logarithmic standard deviation parameter ($\ln \sigma_t$) for each action dimension. This configuration primarily facilitates the generation of stochastic policies by allowing actions to be sampled from a Gaussian distribution.
  • ...and 2 more figures
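
The stochastic policy described in the Figure 5 caption can be illustrated with a minimal sketch of Gaussian action sampling from a mean vector and a per-dimension logarithmic standard deviation ($\ln \sigma_t$). The network itself is omitted: `mean` stands in for the policy network output, and the function names `sample_action` and `log_prob` are hypothetical, not taken from the paper.

```python
import math
import random

def sample_action(mean, log_std, rng):
    """Draw one action from a diagonal Gaussian N(mean, exp(log_std)^2)."""
    return [m + math.exp(ls) * rng.gauss(0.0, 1.0) for m, ls in zip(mean, log_std)]

def log_prob(action, mean, log_std):
    """Log-density of the action under the same diagonal Gaussian,
    summed over the independent action dimensions."""
    total = 0.0
    for a, m, ls in zip(action, mean, log_std):
        z = (a - m) / math.exp(ls)
        total += -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
    return total

rng = random.Random(0)   # fixed seed so the sketch is reproducible
mean = [0.3, -0.1]       # placeholder for the policy network output
log_std = [-1.0, -1.0]   # placeholder for the learned ln(sigma) parameters
action = sample_action(mean, log_std, rng)
lp = log_prob(action, mean, log_std)
```

Parameterizing the standard deviation through its logarithm keeps $\sigma_t$ positive without a constraint on the learned parameter, and the log-density is the quantity a policy-gradient method would typically use when updating the policy.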