ForestTrav: Accurate, Efficient and Deployable Forest Traversability Estimation for Autonomous Ground Vehicles

Fabio Ruetz, Nicholas Lawrance, Emili Hernández, Paulo Borges, Thierry Peynot

TL;DR

ForestTrav tackles traversability estimation (TE) for autonomous ground vehicles in dense vegetation by fusing lidar data into a high-fidelity 3D probabilistic voxel map and applying an ensemble of sparse convolutional neural networks for per-voxel traversability prediction. The method operates online on a local map around the robot and uses 13 statistics stored per voxel as input features, achieving state-of-the-art accuracy while requiring only modest training data and time. Experiments against 3D state-of-the-art methods and 2.5D baselines on real-world forest data show large gains in Matthews correlation coefficient (MCC), e.g., MCC up to 0.62 at 0.1 m voxels, and robust performance across vegetation densities, with strong generalization to unseen environments. An open-source dataset and a map-quality sensitivity analysis further demonstrate ForestTrav's practical applicability for real-time autonomous navigation in complex vegetated terrain.

Abstract

Autonomous navigation in unstructured vegetated environments remains an open challenge. To successfully operate in these settings, ground vehicles must assess the traversability of the environment and determine which vegetation is pliable enough to push through. In this work, we propose a novel method for traversability estimation (TE) in densely vegetated environments that combines a high-fidelity, feature-rich 3D voxel representation with sparse convolutional neural networks (SCNNs), which exploit the structural context and sparsity of the data. The proposed method is thoroughly evaluated on an accurately labeled real-world data set that we provide to the community. It is shown to outperform state-of-the-art methods by a significant margin (0.59 vs. 0.39 MCC score at 0.1 m voxel resolution) in challenging scenes and to generalize to unseen environments. In addition, the method is economical in the amount of training data and training time required: a model is trained in minutes on a desktop computer. We show that by exploiting the context of the environment, our method can use different feature combinations with only limited performance variations. For example, our approach can be used with lidar-only features while still assessing complex vegetated environments accurately, which had not previously been demonstrated in the literature for such environments. In addition, we propose an approach to assess a traversability estimator's sensitivity to information quality and show that our method's sensitivity is low.
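The MCC score quoted above is the Matthews correlation coefficient, which for a binary classifier is (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The following is a minimal sketch of how such a score could be computed from per-voxel labels. The function names, the label convention (1 = traversable, 0 = non-traversable), and the choice to return 0.0 when the denominator vanishes are assumptions made for illustration, not details taken from the paper.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, tn, fp, fn) for binary labels; 1 = traversable (assumed convention)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient in [-1, 1]."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


def mcc_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Convenience wrapper: MCC directly from per-voxel labels."""
    return mcc(*confusion_counts(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical ground-truth and predicted labels for eight voxels.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(mcc_from_labels(truth, pred))
```

Unlike plain accuracy, MCC uses all four confusion-matrix entries, so a classifier that labels every voxel as traversable scores 0 even when traversable voxels dominate the scene.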

Paper Structure (25 sections, 2 equations, 11 figures, 4 tables)

Figures (11)

  • Figure 1: Top: robot navigating in a forest with low to dense vegetation. Bottom: traversability estimation using ForestTrav. Lidar data are accumulated into a probabilistic 3D voxel map. The map is then passed to a sparse convolutional neural network to estimate per-voxel traversability.
  • Figure 2: Method overview (inference): During robot deployment, a continuous stream of lidar measurements is fused into a single probabilistic 3D voxel map, representing the environment with per-voxel statistics. A local feature map is generated to assess the traversability of a local area around the robot's current pose. Each of the $N$ models independently classifies voxels as either traversable or non-traversable. The ensemble creates the traversability probability for each voxel by taking the mean of the $N$ binary classifications. A sample scene is shown on the top left, with the robot in red.
  • Figure 3: Training data generation: The training data is generated from the post-processed map, using offline-optimized poses. The hand-labeled data set is fused with the robot experience. The fused post-processed traversability map is split into smaller cubes suitable for training our method. This is repeated for each scene, and the result is added to the reference database, which contains all training data. The exception is the separate test set, where the scene is not split into smaller cubes. Details of each step are provided in Secs. C-F.
  • Figure 4: Left: Illustration of instances of a robot traversing or colliding with environmental elements. The red bounds indicate the voxels that may cause collisions; green boxes are voxels that the robot successfully traversed. Right: The probabilistic collision map from the robot experience only. Dark purple is non-traversable, yellow is traversable, and green is uncertain. The red arrows are discrete poses of the trajectory. The map correctly captures the voxels most likely to be responsible for the collisions (tree trunk) and the adjacent traversable grass, without any discretization effects.
  • Figure 5: Overview of the U-Net architecture used in this work showing the number of channels, kernel sizes, strides, and skip connections.
  • ...and 6 more figures
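
The ensemble step described in the Figure 2 caption (each of the $N$ models casts a binary vote per voxel, and the traversability probability is the mean of those votes) can be sketched as follows. This is an illustrative sketch only: the function name, the list-of-lists input layout, and the label convention (1 = traversable, 0 = non-traversable) are assumptions, and the models themselves are replaced here by hard-coded votes.

```python
from typing import List, Sequence


def ensemble_traversability(votes: Sequence[Sequence[int]]) -> List[float]:
    """Average N binary per-voxel classifications into per-voxel probabilities.

    votes[m][i] is the 0/1 decision of model m for voxel i
    (assumed convention: 1 = traversable).
    """
    if not votes:
        raise ValueError("need at least one model")
    n_models = len(votes)
    n_voxels = len(votes[0])
    if any(len(model_votes) != n_voxels for model_votes in votes):
        raise ValueError("every model must classify the same set of voxels")
    # Mean over the model axis, one value per voxel.
    return [
        sum(model_votes[i] for model_votes in votes) / n_models
        for i in range(n_voxels)
    ]


if __name__ == "__main__":
    # Hypothetical votes from N = 4 models on three voxels.
    votes = [
        [1, 0, 1],
        [1, 0, 0],
        [1, 1, 0],
        [1, 0, 0],
    ]
    print(ensemble_traversability(votes))  # prints [1.0, 0.25, 0.25]
```

With $N$ models the resulting probability can only take the values 0, 1/N, ..., 1, so the ensemble size sets the resolution of the traversability estimate.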