Unguided Self-exploration in Narrow Spaces with Safety Region Enhanced Reinforcement Learning for Ackermann-steering Robots

Zhaofeng Tian; Zichuan Liu; Xingyu Zhou; Weisong Shi

TL;DR

This paper proposes a rectangular safety region that represents states and detects collisions for rectangular-shaped robots, together with a carefully crafted reinforcement learning reward function that does not require waypoint guidance.

Abstract

In narrow spaces, motion planning based on the traditional hierarchical autonomous system can cause collisions due to mapping, localization, and control noise, especially for car-like Ackermann-steering robots, which suffer from non-convex and non-holonomic kinematics. To tackle these problems, we leverage deep reinforcement learning, which has proven effective for autonomous decision-making, so that the robot explores narrow spaces on its own, without a given map or destination, while avoiding collisions. Specifically, based on our rectangular-shaped Ackermann-steering ZebraT robot and its Gazebo simulator, we propose a rectangular safety region to represent states and detect collisions for rectangular-shaped robots, and a carefully crafted reinforcement learning reward function that does not require waypoint guidance. For validation, the robot was first trained on a simulated narrow track. The trained model was then transferred to other simulated tracks, where it outperformed other methods, including classical and learning-based ones. Finally, the trained model was demonstrated in the real world on our ZebraT robot.
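
As a rough illustration of the rectangular safety region idea, the sketch below computes a per-beam distance threshold that follows a rectangular contour and flags a collision when any LiDAR return falls inside it. It is only a minimal sketch under stated assumptions, not the authors' implementation: the sensor is assumed to sit at the rectangle's center, beams are assumed to be evenly spaced over 360 degrees starting at the robot's heading, and the function names, dimensions, and margin are hypothetical.

```python
import math


def rect_boundary_ranges(half_length, half_width, num_beams, margin=0.0):
    """Distance from the rectangle center to its contour along each beam.

    Assumption: beam i points at angle 2*pi*i/num_beams from the robot's
    heading, and the sensor is at the rectangle center. `margin` inflates
    the rectangle on every side (hypothetical safety padding).
    """
    a = half_length + margin  # extent along the heading axis
    b = half_width + margin   # extent along the lateral axis
    ranges = []
    for i in range(num_beams):
        theta = 2.0 * math.pi * i / num_beams
        c, s = abs(math.cos(theta)), abs(math.sin(theta))
        # The ray leaves the rectangle through whichever side it reaches first.
        dist_front_back = a / c if c > 1e-9 else math.inf
        dist_left_right = b / s if s > 1e-9 else math.inf
        ranges.append(min(dist_front_back, dist_left_right))
    return ranges


def detect_collision(scan, thresholds):
    """Return True if any range reading lies inside the safety region."""
    return any(r < t for r, t in zip(scan, thresholds))


# Hypothetical robot: 0.8 m long, 0.4 m wide, four beams for brevity.
thresholds = rect_boundary_ranges(0.4, 0.2, 4)
side_clear = detect_collision([0.5, 0.25, 0.5, 0.25], thresholds)
side_hit = detect_collision([0.5, 0.15, 0.5, 0.25], thresholds)
```

With a single fixed range large enough to enclose the whole rectangle (the half-diagonal, about 0.447 m here), the first scan would already be flagged because of the 0.25 m side readings, which is the over-coverage that a contour-fitted threshold avoids.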

Paper Structure (19 sections, 9 equations, 6 figures, 3 tables, 1 algorithm)

INTRODUCTION
RELATED WORK
Classical Motion Planning
DRL Motion Planning
Narrow Space Motion Problem
Innovation of This Study
PROBLEM FORMULATION AND BACKGROUND
Deep Q Networks (DQN)
Deep Deterministic Policy Gradient (DDPG)
Methods
State Representation with Rectangular Safety Region
Reward Function Shaping
Reinforcement Learning Workflow
Experiments and Evaluations
State Representation Comparison
...and 4 more sections

Figures (6)

Figure 1: Collision detection. With a fixed range, the circular detection area over-covers the robot. The proposed rectangular safety region instead detects collisions around the rectangular contour.
Figure 2: Reinforcement learning workflow. The networks shown are those used in DDPG training.
Figure 3: State representation comparison. (a) FIFR, fixed interval and fixed range. (b) FIRect, fixed interval with ranges varied to fit the rectangular contour. (c) Comparison by the number of detected collisions.
Figure 4: Training curves. The best score in every 20 episodes is plotted for each of the five algorithms to show the training trend. The solid line denotes the average of the best score over 5 random seeds, and the shaded area denotes the trust region among random seeds.
Figure 5: Evaluation results. Quantitative evaluation for all experiment sets and ablation studies.
...and 1 more figures
