Follow the Footprints: Self-supervised Traversability Estimation for Off-road Vehicle Navigation based on Geometric and Visual Cues

Yurim Jeon; E In Son; Seung-Woo Seo

TL;DR

This work tackles off-road traversability estimation by jointly leveraging geometric (surface slope via surface normals) and visual (semantic) cues with a robot-aware self-supervised signal. It introduces a guide filter network (GFN) to fuse multi-modal information and a footprint supervision module (FSM) to learn robot-dependent traversability from pre-drive footprints, enabling scalability across diverse platforms. Empirical results on RELLIS-3D and ORFD demonstrate improved path-planning safety and competitive freespace detection, with real-time performance suitable for onboard deployment. The approach advances robust, platform-agnostic navigation in unstructured terrains, reducing reliance on human-labeled data while capturing robot-specific traversal characteristics.
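The geometric cue mentioned above, surface slope derived from surface normals, can be illustrated with a small NumPy sketch. This is not the paper's extraction network, which learns to predict the surface normal image from RGB-D input. It is a classical finite-difference approximation, and the function names, the `(H, W, 3)` point-map input, and the choice of `+z` as the up direction are all assumptions made for illustration.

```python
import numpy as np

def normals_from_points(points):
    """Estimate unit surface normals from a per-pixel 3D point map.

    points: (H, W, 3) array of 3D coordinates, one per pixel (hypothetical input,
    e.g. a depth image back-projected with known camera intrinsics).
    """
    # Tangent vectors from finite differences along image columns and rows.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    # The normal is perpendicular to both tangents.
    n = np.cross(du, dv)
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.clip(norm, 1e-12, None)

def slope_degrees(normals, up=(0.0, 0.0, 1.0)):
    """Angle between each normal and the assumed up direction, in degrees."""
    up = np.asarray(up, dtype=float)
    # abs() makes the result independent of the normal's sign convention.
    cos = np.abs(normals @ up)
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
```

On a flat ground plane the slope is 0 degrees everywhere, and on a plane rising one unit per unit of horizontal distance it is 45 degrees. A learned extractor is generally more robust to depth noise and missing measurements than this direct computation.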

Abstract

In this study, we address the off-road traversability estimation problem, which involves predicting the areas where a robot can navigate in off-road environments. An off-road environment is an unstructured environment comprising a combination of traversable and non-traversable spaces, which makes estimating traversability challenging. This study highlights three primary factors that affect a robot's traversability in an off-road environment: surface slope, semantic information, and robot platform. We present two strategies for estimating traversability: a guide filter network (GFN) and a footprint supervision module (FSM). The first strategy involves building a novel GFN using a newly designed guide filter layer. The GFN interprets the surface and semantic information from the input data and integrates them to extract features optimized for traversability estimation. The second strategy involves developing an FSM, a self-supervision module that utilizes the path traversed by the robot during pre-driving, also known as a footprint. This enables the prediction of traversability that reflects the characteristics of the robot platform. Based on these two strategies, the proposed method overcomes the limitations of existing methods, which require laborious human supervision and lack scalability. Extensive experiments under diverse conditions, covering automobiles and unmanned ground vehicles as well as herbfields, woodlands, and farmlands, demonstrate that the proposed method is compatible with various robot platforms and adaptable to a range of terrains. Code is available at https://github.com/yurimjeon1892/FtFoot.
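The idea of supervising from footprints can be sketched as follows: pixels the robot actually drove over are treated as known-traversable, and everything else is left unlabeled. This is only a minimal illustration of positive-only self-supervision. The paper's FSM additionally uses a random walk and its own loss function, neither of which is reproduced here. The names `footprint_mask` and `positive_only_loss`, the pixel-coordinate footprint format, and the `radius` parameter are hypothetical.

```python
import numpy as np

def footprint_mask(shape, footprint_px, radius=1):
    """Mark pixels near the driven path as positive (traversable) labels.

    shape: (H, W) of the traversability map.
    footprint_px: iterable of (row, col) pixel positions of the path
                  (assumed to be already projected into the image).
    radius: half-width in pixels of the square stamped around each position.
    """
    h, w = shape
    mask = np.zeros(shape, dtype=bool)
    for r, c in footprint_px:
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        mask[r0:r1, c0:c1] = True
    return mask

def positive_only_loss(p_trav, mask, eps=1e-7):
    """Mean negative log-likelihood over footprint pixels only.

    Pixels outside the mask are unlabeled and contribute nothing.
    """
    if not mask.any():
        return 0.0
    p = np.clip(p_trav[mask], eps, 1.0)
    return float(-np.log(p).mean())
```

Because only positive labels are available, a loss like this can be minimized trivially by predicting 1 everywhere, so a practical method needs some additional mechanism to handle unlabeled regions.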

Paper Structure (17 sections, 4 equations, 6 figures, 2 tables)

INTRODUCTION
RELATED WORK
METHODS
Overview
Guide Filter Network (GFN)
Background: Dynamic filter layer
Guide filter layer
Footprint Supervision Module (FSM)
Random walk
Loss function
EXPERIMENTS
Dataset
Experiments on Path Planning
Experimental setting
Experimental results
...and 2 more sections

Figures (6)

Figure 1: Overview of the proposed method. The proposed method consists of two components: a guide filter network optimized for feature extraction and fusion in off-road traversability estimation, and a footprint supervision module that learns traversability in a self-supervised manner from the robot's path, referred to as the footprint.
Figure 2: Structure of the proposed method. The guide filter network (GFN) consists of an extraction network and a fusion network. The extraction network estimates the surface normal image $p_{sn}$ from the input RGB-D image $x_{rgbd}$. Subsequently, a fusion network composed of guide filter layers integrates information from different modalities. Finally, the footprint supervision module (FSM) predicts the traversability map $p_{trav}$.
Figure 3: Structure of the guide filter layer. The guide filter layer takes two inputs: the guidance image $x_{g}$ and the convolve image $x_{c}$. First, these images are fed into a confidence-generating layer, which produces weighted images $x'_{g}$ and $x'_{c}$. These are then forwarded to the filter-generating layer to generate two decomposed filters, $K'$ and $K''$. Finally, these filters are sequentially applied to the initial input $x_{c}$, resulting in the output $y$.
Figure 4: Structure of the footprint supervision module. The footprint supervision module takes an input feature map, $F(x)$. This module consists of a random walk (RW) and a convolutional layer (C). $Tr$ represents the transformation function. $p_{trav}$ is the final traversability map. The inference path is represented by the light blue box.
Figure 5: Examples of the dataset. Sample images from the RELLIS-3D and ORFD datasets.
...and 1 more figures
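The last step described in the Figure 3 caption, applying two decomposed filters in sequence to $x_{c}$, can be sketched in NumPy. The caption does not say how $K'$ and $K''$ are decomposed, so this sketch assumes a common choice: a spatially varying horizontal 1D pass followed by a vertical 1D pass. The function name, the per-pixel weight layout `(H, W, k)`, and the edge padding are all assumptions, and the confidence-generating and filter-generating layers that would produce the weights are not modeled.

```python
import numpy as np

def apply_decomposed_filters(x_c, k_h, k_v):
    """Apply per-pixel horizontal, then per-pixel vertical 1D filters.

    x_c: (H, W) image to be filtered.
    k_h: (H, W, k) horizontal weights for each output pixel (assumed K').
    k_v: (H, W, k) vertical weights for each output pixel (assumed K'').
    k is assumed odd so each window is centered on its pixel.
    """
    h, w = x_c.shape
    k = k_h.shape[2]
    pad = k // 2
    # Horizontal pass: each pixel mixes its k neighbors along the row.
    xp = np.pad(x_c, ((0, 0), (pad, pad)), mode="edge")
    mid = sum(k_h[:, :, i] * xp[:, i:i + w] for i in range(k))
    # Vertical pass on the intermediate result.
    mp = np.pad(mid, ((pad, pad), (0, 0)), mode="edge")
    return sum(k_v[:, :, i] * mp[i:i + h, :] for i in range(k))
```

Two 1D passes with k weights each cost 2k multiplications per pixel instead of k squared for a full 2D filter, which is one reason decomposed filters are attractive for real-time use.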
