Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making

Lei Zhuang, Jingdong Zhao, Yuntao Li, Zichun Xu, Liangliang Zhao, Hong Liu

TL;DR

This work proposes the Transformer-Enhanced Motion Planner (TEMP), a deep learning-based motion planning framework that combines an Environmental Information Semantic Encoder (EISE) with a Motion Planning Transformer (MPT) and achieves exceptional performance metrics and a heightened degree of generalizability compared to state-of-the-art SBMPs.

Abstract

Sampling-based motion planning (SBMP) algorithms are renowned for their robust global search capabilities. However, the inherent randomness of their sampling mechanisms often results in inconsistent path quality and limited search efficiency. In response to these challenges, this work proposes a novel deep learning-based motion planning framework, named Transformer-Enhanced Motion Planner (TEMP), which combines an Environmental Information Semantic Encoder (EISE) with a Motion Planning Transformer (MPT). EISE converts environmental data into semantic environmental information (SEI), giving MPT a richer understanding of the environment. MPT uses an attention mechanism to dynamically recalibrate its focus on SEI, task objectives, and historical planning data, thereby refining the generation of sampling nodes. To demonstrate the capabilities of TEMP, we train our model on a dataset composed of planning results produced by RRT*. EISE and MPT are collaboratively trained, enabling EISE to autonomously learn and extract patterns from environmental data and to form semantic representations that MPT can more effectively interpret and use for motion planning. We then systematically evaluate TEMP across diverse task dimensions; the results demonstrate that TEMP achieves exceptional performance metrics and a heightened degree of generalizability compared to state-of-the-art SBMPs.
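To make the attention step described in the abstract more concrete, the following is a minimal sketch of single-head scaled dot-product attention over a token sequence built from environment tokens, a goal token, and previously generated nodes. It is not the authors' architecture: the identity projections, the 2D token values, the choice of the most recent node as the query, and the names `attend`, `sei_tokens`, `goal_token`, and `history` are all assumptions made for illustration, and no trained weights are involved.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, tokens):
    """Single-head scaled dot-product attention with identity projections.

    Returns the attention weights and the weighted sum of the tokens.
    A trained model would apply learned query/key/value projections instead.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    mixed = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
    return weights, mixed


# Hypothetical 2D inputs: two environment tokens, one goal token, two history nodes.
sei_tokens = [[0.2, 0.8], [0.9, 0.1]]
goal_token = [1.0, 1.0]
history = [[0.0, 0.0], [0.3, 0.2]]

# The model attends jointly over environment, task objective, and planning history.
tokens = sei_tokens + [goal_token] + history
query = history[-1]  # assumption: the most recent node drives the next proposal

weights, proposal = attend(query, tokens)
print("attention weights:", [round(w, 3) for w in weights])
print("proposed sample:", [round(x, 3) for x in proposal])
```

In this toy setting the proposal is simply a convex combination of the input tokens; in a learned planner the attention output would pass through further layers before a sampling node is produced.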

Paper Structure

This paper contains 14 sections, 9 equations, 9 figures, 1 table, 1 algorithm.

Figures (9)

  • Figure 1: Performance assessment of TEMP versus RRT* in 2D planning, focusing on paths of comparable quality. The variables $t$, $N$, and $\mathcal{J}$ denote planning time, number of nodes generated, and path cost, respectively. (a) $t$ = 0.10 s, $N$ = 75, $\mathcal{J}$ = 17.78; (b) $t$ = 6.63 s, $N$ = 1599, $\mathcal{J}$ = 17.97; (c) $t$ = 0.07 s, $N$ = 51, $\mathcal{J}$ = 18.65; (d) $t$ = 4.62 s, $N$ = 902, $\mathcal{J}$ = 18.65.
  • Figure 2: Network architecture of the Transformer-Enhanced Motion Planner, illustrating the data flow within the system, particularly highlighting how the Environmental Information Semantic Encoder and the Motion Planning Transformer process information and contribute to generating the sampling node.
  • Figure 3: Comparative analysis of planning solutions between TEMP (Red) and IRRT* (Cyan) in 2D and 3D scenarios. The symbols $t_{\text{T}}$ and $t_{\text{I}}$ represent the planning times for TEMP and IRRT*, respectively; similarly, $\mathcal{J}_{\text{T}}$ and $\mathcal{J}_{\text{I}}$ indicate the path cost for each algorithm. Because obstacles are densely distributed in the 3D planning environments, obstacles with a relatively small impact on the planned path are rendered more transparent to improve clarity. (a) $t_{\text{T}}$ = 0.10 s, $\mathcal{J}_{\text{T}}$ = 21.87, $t_{\text{I}}$ = 1.08 s, $\mathcal{J}_{\text{I}}$ = 22.19; (b) $t_{\text{T}}$ = 0.09 s, $\mathcal{J}_{\text{T}}$ = 24.73, $t_{\text{I}}$ = 3.34 s, $\mathcal{J}_{\text{I}}$ = 26.01; (c) $t_{\text{T}}$ = 0.11 s, $\mathcal{J}_{\text{T}}$ = 23.97, $t_{\text{I}}$ = 1.58 s, $\mathcal{J}_{\text{I}}$ = 27.70; (d) $t_{\text{T}}$ = 0.41 s, $\mathcal{J}_{\text{T}}$ = 25.71, $t_{\text{I}}$ = 4.50 s, $\mathcal{J}_{\text{I}}$ = 29.77; (e) $t_{\text{T}}$ = 0.14 s, $\mathcal{J}_{\text{T}}$ = 20.91, $t_{\text{I}}$ = 1.78 s, $\mathcal{J}_{\text{I}}$ = 21.51; (f) $t_{\text{T}}$ = 0.17 s, $\mathcal{J}_{\text{T}}$ = 24.29, $t_{\text{I}}$ = 2.17 s, $\mathcal{J}_{\text{I}}$ = 26.59; (g) $t_{\text{T}}$ = 0.29 s, $\mathcal{J}_{\text{T}}$ = 26.10, $t_{\text{I}}$ = 3.53 s, $\mathcal{J}_{\text{I}}$ = 26.91; (h) $t_{\text{T}}$ = 0.07 s, $\mathcal{J}_{\text{T}}$ = 22.02, $t_{\text{I}}$ = 3.69 s, $\mathcal{J}_{\text{I}}$ = 22.82.
  • Figure 4: Selected intermediate configurations along the path generated by TEMP when planning for a 7-DOF manipulator. (a) $t$ = 0.33 s; (b) $t$ = 0.34 s.
  • Figure 5: Planning success rate versus time curves for TEMP, RRT*, and IRRT* in 2D, 3D, and 7D scenarios.
  • ...and 4 more figures
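
The per-panel values quoted in the Figure 3 caption can be turned into relative numbers with a few lines of arithmetic. The sketch below computes, for each panel, the planning-time ratio $t_{\text{I}} / t_{\text{T}}$ and the relative path-cost reduction $(\mathcal{J}_{\text{I}} - \mathcal{J}_{\text{T}}) / \mathcal{J}_{\text{I}}$. The numbers are copied from the caption; the variable names and the choice of these two summary quantities are illustrative assumptions, and the panels are individual examples rather than a statistical benchmark.

```python
# Values from the Figure 3 caption: (t_T [s], J_T, t_I [s], J_I) per panel.
panels = {
    "a": (0.10, 21.87, 1.08, 22.19),
    "b": (0.09, 24.73, 3.34, 26.01),
    "c": (0.11, 23.97, 1.58, 27.70),
    "d": (0.41, 25.71, 4.50, 29.77),
    "e": (0.14, 20.91, 1.78, 21.51),
    "f": (0.17, 24.29, 2.17, 26.59),
    "g": (0.29, 26.10, 3.53, 26.91),
    "h": (0.07, 22.02, 3.69, 22.82),
}

summary = {}
for name, (t_temp, j_temp, t_irrt, j_irrt) in panels.items():
    speedup = t_irrt / t_temp                          # how many times faster TEMP is
    cost_drop = 100.0 * (j_irrt - j_temp) / j_irrt     # path-cost reduction in percent
    summary[name] = (speedup, cost_drop)

for name, (speedup, cost_drop) in summary.items():
    print(f"({name}) speedup = {speedup:.1f}x, path cost reduction = {cost_drop:.1f}%")
```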