Learning Orientation Field for OSM-Guided Autonomous Navigation

Yuming Huang, Wei Gao, Zhiyuan Zhang, Maani Ghaffari, Dezhen Song, Cheng-Zhong Xu, Hui Kong

TL;DR

The paper addresses the challenge of reliable autonomous navigation with noisy OSM data by introducing Orientation Field (OrField), a data-driven yet explainable representation that fuses LiDAR perception with OSM route guidance. A two-stage approach first constructs an initial OrField from OSM and then optimizes it with LiDAR inputs via a SalsaNext-based network, producing a robust field used by two planners, Field-RRT* and Field-Bezier, to generate trajectories that align with road geometry. Experiments on SemanticKITTI and a campus dataset show that OrField-guided planning improves robustness and accuracy, particularly in complex scenarios and under sensor occlusions, with end-to-end baselines lagging in generalization. The work demonstrates strong performance across frame- and trajectory-level metrics and offers ablations validating orientation-centric planning over traditional free-space methods, along with publicly available code for training and deployment.

Abstract

OpenStreetMap (OSM) has gained popularity recently in autonomous navigation due to its public accessibility, lower maintenance costs, and broader geographical coverage. However, existing methods often struggle with noisy OSM data and incomplete sensor observations, leading to inaccuracies in trajectory planning. These challenges are particularly evident in complex driving scenarios, such as at intersections or under occlusion. To address them, we propose a robust and explainable two-stage framework that learns an Orientation Field (OrField) for robot navigation by integrating LiDAR scans and OSM routes. In the first stage, we introduce a novel representation, OrField, which provides an orientation for each grid cell on the map by reasoning jointly from noisy LiDAR scans and OSM routes. To generate a robust OrField, we train a deep neural network that encodes a versatile initial OrField and outputs an optimized OrField. In the second stage, we propose two OrField-based trajectory planners for OSM-guided robot navigation, Field-RRT* and Field-Bezier, which adapt the Rapidly Exploring Random Tree (RRT) algorithm and the Bezier curve, respectively, to estimate trajectories. Because OrField captures both global and local information, Field-RRT* and Field-Bezier can generate accurate and reliable trajectories even in challenging conditions. We validate our approach through experiments on the SemanticKITTI dataset and our own campus dataset. The results demonstrate the effectiveness of our method, which achieves superior performance in complex and noisy conditions. Our code for network training and real-world deployment is available at https://github.com/IMRL/OriField.

Paper Structure

This paper contains 20 sections, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Comparison of our method with previous step-by-step and end-to-end approaches. (a): The correct planning result when observations and OSM routes contain minimal noise. (b): The step-by-step planning result when observations are incomplete and OSM routes are noisy. (c): The end-to-end approach predicts an infeasible trajectory drawn from the distribution of training trajectories. (d): Our method plans the correct trajectory based on orientations (gray arrows) predicted by the deep network. In all figures, the red dashed line represents the OSM route, and the green line represents the planned trajectory.
  • Figure 2: The visualization of the trajectory planning results of the typical implementation of the RRT* planner (purple) and our Field-RRT* planner (green). The red trajectory represents the OSM trajectory. Only the orientations within the free space are shown for clarity.
  • Figure 3: Examples of the LiDAR BEV, the initial OrField, the distance map, and the optimized OrField. (i): The global topometric map. (ii): The navigation route in the sensor swath and its corresponding Bezier curve. (a): An example of a BEV grid map generated from a LiDAR scan, with the average intensity, maximum height, and number of points as features (colored in red, green, and blue, respectively). (b1): The initial OrField at the intersection of (a). The orientation vector $\textbf{n}$ is given by the tangent at the closest point on the Bezier curve generated from the navigation route. (b2): The distance map at the intersection of (a). Each value is the distance to the closest point on the Bezier curve. (c): The OrField optimized by our trained deep network.
  • Figure 4: The training sample generation. A training pair consists of an initial OrField and an orientation label generated from a full observation aggregating past, current, and future frames. (i) The BEV of the point cloud aggregated from the consecutive past, current, and future LiDAR scans with odometry information. (ii) The free space, frontiers, and the Dijkstra shortest path connecting the selected source and target frontiers in (i). The right side shows the Euclidean distance transform (EDT) and the orientation calculation based on the EDT. (a) the EDT of the free space. (b) the gradient direction of (a). (c) the direction perpendicular to (b). (d) the Dijkstra shortest path tree generated from the target frontier. (e) the direction that is both perpendicular to (b) and coherent with the direction of (d). (f) the inverse EDT of the free space. (g) the gradient direction of (f). (h) the combination of (e) and (g).
  • Figure 5: An example of a training batch with 8 samples. On the left are the feature distributions of this batch, and on the right is the visualization of 4 examples in this batch. Row 1 on the left: Distributions of 5 features for network inputs. From the left to the right are the average intensity, the maximum height, the number of points, the direction along the x-axis, and the direction along the y-axis. Row 2 on the left: The batch-normalized distributions of these features. Row 3 on the left: The instance-normalized distributions of these features. Samples are shown in different distributions. Row 4 on the left: The instance-normalized distributions of these features. Samples are shown in one distribution. Right: The visualization of 4 examples in this batch.
  • ...and 7 more figures
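
The caption of Figure 3 describes the initial OrField as the tangent at the closest point on a Bezier curve, paired with a map of distances to that curve. The following is a minimal sketch of that construction, not the authors' implementation. The cubic curve order, the grid size, the cell size, the brute-force nearest-point search, and the function names `bezier_points` and `initial_orfield` are all assumptions made for illustration.

```python
import numpy as np


def bezier_points(ctrl, n_samples=200):
    """Sample a cubic Bezier curve and its unit tangents.

    ctrl: four 2D control points (hypothetical input; the curve order
    is an assumption, the source does not state it here).
    """
    p0, p1, p2, p3 = np.asarray(ctrl, dtype=float)
    t = np.linspace(0.0, 1.0, n_samples)[:, None]  # shape (n, 1)
    # Curve positions from the Bernstein form.
    pts = ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
           + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)
    # First derivative of the curve with respect to t.
    d = (3 * (1 - t) ** 2 * (p1 - p0) + 6 * (1 - t) * t * (p2 - p1)
         + 3 * t ** 2 * (p3 - p2))
    # Normalize to unit tangents, guarding against a zero derivative.
    norm = np.maximum(np.linalg.norm(d, axis=1, keepdims=True), 1e-12)
    return pts, d / norm


def initial_orfield(ctrl, grid_size=(64, 64), cell=0.5):
    """Build an orientation field and distance map on a 2D grid.

    Each cell takes the unit tangent of its nearest sampled curve point,
    and the distance map stores the distance to that point.
    """
    h, w = grid_size
    ys, xs = np.mgrid[0:h, 0:w]
    # Cell coordinates in metric units, shape (h * w, 2) as (x, y).
    centers = np.stack([xs, ys], axis=-1).reshape(-1, 2) * cell
    pts, tangents = bezier_points(ctrl)
    # Brute-force distances from every cell to every curve sample.
    dist = np.linalg.norm(centers[:, None, :] - pts[None, :, :], axis=-1)
    nearest = dist.argmin(axis=1)
    field = tangents[nearest].reshape(h, w, 2)
    dist_map = dist[np.arange(h * w), nearest].reshape(h, w)
    return field, dist_map
```

Calling `initial_orfield` with four control points returns an array of shape `(h, w, 2)` holding one unit vector per cell and an `(h, w)` distance map. The brute-force search keeps the sketch short; a k-d tree would likely be a better fit for larger grids.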