SNE-RoadSegV2: Advancing Heterogeneous Feature Fusion and Fallibility Awareness for Freespace Detection

Yi Feng; Yu Ma; Qijun Chen; Ioannis Pitas; Rui Fan

TL;DR

This work tackles freespace detection for autonomous driving by addressing two core bottlenecks: discriminative fusion of heterogeneous features, and supervision that accounts for model fallibility. It introduces SNE-RoadSegV2, whose heterogeneous feature fusion block (HF$^2$B) comprises a Holistic Attention Module, a Heterogeneous Feature Contrast Descriptor, and an Affinity-Weighted Feature Recalibrator, and is paired with a lightweight decoder that uses both inter-scale and intra-scale skip connections. The model is trained with two fallibility-aware losses, the Semantic Transition-Aware Loss and the Depth Inconsistency-Aware Loss, combined into a unified objective $L = L_{BCE} + \lambda_S L_{STA} + \lambda_D L_{DIA}$. Experiments on KITTI Road, Cityscapes, vKITTI2, and KITTI Semantics demonstrate state-of-the-art performance: the method ranks 1st on the KITTI Road benchmark and shows consistent improvements near semantic-transition and depth-inconsistent regions. In practice, the approach delivers more coherent and reliable freespace detection under challenging conditions, and it paves the way for extending heterogeneous feature fusion and fallibility-aware supervision to broader semantic segmentation tasks.
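
The unified objective above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: `bce_loss` and `total_loss` are hypothetical helper names, the STA and DIA terms are treated as already-computed scalars because their definitions are not given in this summary, and the default weights of 1.0 are placeholders rather than the paper's values.

```python
import math


def bce_loss(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over flattened per-pixel probabilities.

    probs  : predicted freespace probabilities in [0, 1]
    labels : ground-truth labels (1 = freespace, 0 = non-freespace)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def total_loss(l_bce, l_sta, l_dia, lambda_s=1.0, lambda_d=1.0):
    """Weighted sum L = L_BCE + lambda_S * L_STA + lambda_D * L_DIA.

    l_sta and l_dia are assumed to be precomputed scalar loss values;
    lambda_s and lambda_d are placeholder weights (assumption).
    """
    return l_bce + lambda_s * l_sta + lambda_d * l_dia


if __name__ == "__main__":
    l_bce = bce_loss([0.9, 0.2, 0.7], [1, 0, 1])
    # The 0.05 and 0.03 values stand in for the STA and DIA terms.
    print(total_loss(l_bce, l_sta=0.05, l_dia=0.03, lambda_s=0.5, lambda_d=0.5))
```

Keeping the two fallibility-aware terms as separate weighted addends makes it easy to ablate either one by setting its weight to zero.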

Abstract

Feature-fusion networks with duplex encoders have proven to be an effective technique for solving the freespace detection problem. However, despite the compelling results achieved by previous research efforts, the exploration of adequate and discriminative heterogeneous feature fusion, as well as the development of fallibility-aware loss functions, remain relatively scarce. This paper makes several significant contributions to address these limitations: (1) it presents a novel heterogeneous feature fusion block, comprising a holistic attention module, a heterogeneous feature contrast descriptor, and an affinity-weighted feature recalibrator, enabling a more in-depth exploitation of the inherent characteristics of the extracted features; (2) it incorporates both inter-scale and intra-scale skip connections into the decoder architecture while eliminating redundant ones, leading to both improved accuracy and computational efficiency; and (3) it introduces two fallibility-aware loss functions that separately focus on semantic-transition and depth-inconsistent regions, collectively contributing to stronger supervision during model training. Our proposed heterogeneous feature fusion network (SNE-RoadSegV2), which incorporates all these components, outperforms all other freespace detection algorithms across multiple public datasets. Notably, it ranks 1st on the official KITTI Road benchmark.

Paper Structure (25 sections, 12 equations, 13 figures, 9 tables)

Introduction
Related Work
Data-Driven Freespace Detection
Heterogeneous Feature Fusion
Attention Mechanisms
Methodology
Architecture Overview
Heterogeneous Feature Fusion Block
Holistic Attention Module
Heterogeneous Feature Contrast Descriptor
Affinity-Weighted Feature Recalibrator
Lightweight yet More Effective Decoder
Fallibility-Aware Loss Functions
Semantic Transition-Aware Loss
Depth Inconsistency-Aware Loss
...and 10 more sections

Figures (13)

Figure 1: An overview of our proposed SNE-RoadSegV2.
Figure 2: An illustration of our proposed heterogeneous feature fusion block, consisting of (1) a holistic attention module, (2) a heterogeneous feature contrast descriptor, and (3) an affinity-weighted feature recalibrator.
Figure 3: Comparisons among decoder architectures of RoadSeg/UNet++, UNet3+, and our proposed SNE-RoadSegV2. $w_d^{i,j}$ denotes a basic convolutional layer (Basic Conv) or depth-wise separable convolutional layer (DSConv) with batchnorm and sigmoid layers.
Figure 4: Qualitative comparisons of state-of-the-art freespace detection algorithms on the KITTI Road dataset [fritsch2013new]. The results of the compared algorithms are obtained using their officially published source code and weights: (a) freespace detection results; (b) probability maps.
Figure 5: Qualitative comparisons of state-of-the-art freespace detection algorithms on the Cityscapes dataset [cordts2016cityscapes]. The results are visualized with true-positive classifications in green, false-positive in blue, and false-negative in red.
...and 8 more figures
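
The Figure 3 caption contrasts a basic convolutional layer with a depth-wise separable one (DSConv). The generic parameter saving of that substitution can be sketched as below. This is a standard back-of-the-envelope count, not a figure from the paper: the function names, the 3x3 kernel, and the 64-channel example are assumptions, and bias and batch-norm parameters are ignored.

```python
def basic_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsconv_params(c_in, c_out, k=3):
    """Weights in a depth-wise separable convolution (bias ignored):
    one k x k depth-wise filter per input channel, followed by a
    1 x 1 point-wise convolution that mixes channels."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out = 64, 64  # illustrative channel widths (assumption)
    basic = basic_conv_params(c_in, c_out)
    separable = dsconv_params(c_in, c_out)
    print(basic, separable, round(basic / separable, 2))
```

For a 3x3 kernel the ratio approaches 9 as the output width grows, which is the usual reason such layers appear in lightweight decoders.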
