A Transferability Metric Using Scene Similarity and Local Map Observation for DRL Navigation

Shiwei Lian; Feitian Zhang

TL;DR

The paper tackles transferability in DRL-based robotic navigation without a global map by introducing a scene similarity framework with global and local indicators. It couples an improved image template matching method with a local map observation that fuses 2D LiDAR data, agent position, and goal information, enabling robust policy learning and sensor-FoV interchangeability. The global similarity $SS_{\rm global}$ and local similarity $SS_{\rm local}$ quantify scene closeness and safety risk, and are shown to correlate with navigation success across 26 simulated and real-world scenes. Findings demonstrate that the local map-based DRL approach yields higher transferability and safety, with the similarity metrics guiding training scene design and deployment considerations in practical robotic navigation.
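To make the local map idea concrete, here is a minimal Python sketch of how a single 2D LiDAR scan and a goal position might be rasterized into a robot-centered grid. It is an illustration only, not the authors' implementation: the function name `build_local_map`, the two-channel layout, the grid size, the resolution, and the clipping of an out-of-view goal to the map border are all assumptions. The agent position is represented implicitly by the map center.

```python
import numpy as np


def build_local_map(ranges, angles, goal_xy, size=64, resolution=0.1, max_range=3.0):
    """Rasterize one 2D LiDAR scan into a robot-centered grid (hypothetical layout).

    ranges, angles: scan in the robot frame (meters, radians).
    goal_xy: goal position in the robot frame (meters).
    Returns a (2, size, size) float32 array: channel 0 marks obstacle hits,
    channel 1 marks the goal, clipped to the border when it is out of view.
    """
    ranges = np.asarray(ranges, dtype=float)
    angles = np.asarray(angles, dtype=float)
    grid = np.zeros((2, size, size), dtype=np.float32)
    center = size // 2  # the robot sits at the center cell

    def to_cell(x, y):
        # x (forward) maps to decreasing row, y (left) maps to decreasing column.
        row = center - int(round(x / resolution))
        col = center - int(round(y / resolution))
        return row, col

    # Keep only finite returns that fall inside the assumed sensing range.
    valid = np.isfinite(ranges) & (ranges < max_range)
    xs = ranges[valid] * np.cos(angles[valid])
    ys = ranges[valid] * np.sin(angles[valid])
    for x, y in zip(xs, ys):
        r, c = to_cell(x, y)
        if 0 <= r < size and 0 <= c < size:
            grid[0, r, c] = 1.0

    # Mark the goal; clip so a far-away goal still appears on the border.
    gr, gc = to_cell(goal_xy[0], goal_xy[1])
    gr = min(max(gr, 0), size - 1)
    gc = min(max(gc, 0), size - 1)
    grid[1, gr, gc] = 1.0
    return grid
```

Because the output shape depends only on `size`, not on the number of beams or the scan's angular span, a representation of this kind is one plausible reason a policy could be reused across sensors with different fields of view.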

Abstract

While deep reinforcement learning (DRL) has attracted a rapidly growing interest in solving the problem of navigation without global maps, DRL typically leads to a mediocre navigation performance in practice due to the gap between the training scene and the actual test scene. To quantify the transferability of a DRL agent between the training and test scenes, this paper proposes a new transferability metric -- the scene similarity calculated using an improved image template matching algorithm. Specifically, two transferability performance indicators are designed including the global scene similarity that evaluates the overall robustness of a DRL algorithm and the local scene similarity that serves as a safety measure when a DRL agent is deployed without a global map. In addition, this paper proposes the use of a local map that fuses 2D LiDAR data with spatial information of both the agent and the destination as the DRL observation, aiming to improve the transferability of DRL navigation algorithms. With a wheeled robot as the case study platform, both simulation and real-world experiments are conducted in a total of 26 different scenes. The experimental results affirm the robustness of the local map observation design and demonstrate the strong correlation between the scene similarity metric and the success rate of DRL navigation algorithms.
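As background for the similarity metric, the sketch below shows conventional image template matching with zero-mean normalized cross-correlation in plain NumPy. It is not the paper's improved algorithm, whose details are not given here; it only illustrates the baseline idea of sliding a template (for example, a local map from the test scene) over a larger image (for example, a training scene map) and keeping the best score. The function name `match_template` and the brute-force search are assumptions made for illustration.

```python
import numpy as np


def match_template(image, template):
    """Conventional template matching (baseline, not the paper's improved variant).

    Slides `template` over `image` and returns (best_score, (row, col)), where
    (row, col) is the top-left corner of the window with the highest zero-mean
    normalized cross-correlation. Scores lie in [-1, 1].
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape

    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())

    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            # A flat window or flat template has no defined correlation; score it 0.
            score = (t * w).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

A baseline like this ignores where the agent can actually be and how often each observation was seen in training, which are the two problems the caption of Figure 4 attributes to the conventional method.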

Paper Structure (25 sections, 16 equations, 14 figures, 2 tables)

Introduction
Related Work
Robot Navigation
DRL Transferability
Problem Description
Local Map-Based DRL Navigation
Deep Reinforcement Learning
Using Local Maps as Observations
Scene Similarity Metric Based On Improved Image Template Matching
Image Template Matching Algorithm
Global Scene Similarity
Local Scene Similarity
Case Study on Local Scene Similarity Metric
Case Study Implementation Details
Simulation Results
...and 10 more sections

Figures (14)

Figure 1: The schematic of the generation of the local map at time $t$.
Figure 2: The schematic of the adopted network architecture for DQN and the actor-critic network of TD3. Conv2D (kernel size, stride, number of kernels) is the 2D convolutional layer. Maxpool2D (kernel size, stride) is the 2D max pooling layer. Fc (number of hidden units) is the fully connected layer. The activation function is ReLU.
Figure 3: Illustration of the calculation of the global scene similarity.
Figure 4: (a) Illustration of problem #1 using conventional image template matching method. The center of the best match lies within the obstacle region that the robot agent cannot reach in practice. (b) Illustration of problem #2 using conventional image template matching method. It shows all trajectories stored in the replay buffer. A point in darker red indicates that the observation at that point is more likely to be sampled in the training and thereby better learnt than a point in lighter red.
Figure 5: 12 test scenes constructed in the PyBullet simulator and the sampled trajectories of the robot agent trained by the local map-based DQN algorithm. All the test scenes have the dimension of 10m$\times$10m. The order of the scenes is sorted from the highest to the lowest $SS_{\rm local}$.
...and 9 more figures
