A2DO: Adaptive Anti-Degradation Odometry with Deep Multi-Sensor Fusion for Autonomous Navigation

Hui Lai, Qi Chen, Junping Zhang, Jian Pu

TL;DR

A2DO tackles degraded-sensor localization for autonomous navigation by fusing LiDAR, camera, and IMU data through a multi-layer ResNet-Transformer encoder and an adaptive degradation filter. The system uses a coarse-to-fine temporal-spatial filtering strategy with Gumbel-Softmax-based feature selection and a homoscedastic loss with learnable uncertainty, enabling robust 6-DOF pose estimation. Pretraining on diverse simulated degradation scenarios (CARLA-Loc) followed by fine-tuning on real-world data yields strong performance across fog, rain, night, and occlusion conditions, with real-time inference. The work demonstrates improved robustness over traditional and other learning-based methods and outlines practical pathways for deployment in autonomous vehicles with degraded sensors.
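
To make the Gumbel-Softmax-based feature selection mentioned above more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `gumbel_softmax` and `select_features`, the two-class keep/drop formulation, and the temperature `tau` are illustrative assumptions. The idea is that adding Gumbel noise to logits and applying a temperature-scaled softmax gives a differentiable approximation of a discrete choice, which can then gate individual features.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample over the given logits.

    Hypothetical helper: adds Gumbel(0, 1) noise to each logit, divides
    by the temperature tau, and applies a softmax.
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_features(features, keep_logits, tau=1.0, rng=random):
    """Scale each feature by a relaxed keep/drop decision (assumed scheme).

    For every feature, a two-class Gumbel-Softmax is drawn over
    [keep_logit, 0.0]; the 'keep' probability multiplies the feature.
    """
    gated = []
    for value, logit in zip(features, keep_logits):
        keep_prob, _drop_prob = gumbel_softmax([logit, 0.0], tau, rng)
        gated.append(value * keep_prob)
    return gated


if __name__ == "__main__":
    rng = random.Random(0)
    # A high keep logit tends to preserve a feature; a low one tends to suppress it.
    print(select_features([1.0, 2.0, 3.0], [4.0, 0.0, -4.0], tau=0.5, rng=rng))
```

Lower values of `tau` push each gate closer to a hard 0/1 decision, while higher values keep the gates soft. In a trained network the keep logits would be predicted from the sensor features rather than fixed by hand as they are here.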

Abstract

Accurate localization is essential for the safe and effective navigation of autonomous vehicles, and Simultaneous Localization and Mapping (SLAM) is a cornerstone technology in this context. However, the performance of a SLAM system can deteriorate under challenging conditions such as low light, adverse weather, or obstructions, owing to sensor degradation. We present A2DO, a novel end-to-end multi-sensor fusion odometry system that enhances robustness in these scenarios through deep neural networks. A2DO integrates LiDAR and visual data, employing a multi-layer, multi-scale feature encoding module augmented by an attention mechanism to mitigate sensor degradation dynamically. The system is pre-trained extensively on simulated datasets covering a broad range of degradation scenarios and fine-tuned on a curated set of real-world data, ensuring robust adaptation to complex scenarios. Our experiments demonstrate that A2DO maintains superior localization accuracy and robustness across various degradation conditions, showcasing its potential for practical implementation in autonomous vehicle systems.

Paper Structure

This paper contains 19 sections, 9 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: A2DO framework pipeline. Raw sensor (LiDAR, Camera, IMU) data is preprocessed via 2D projection and timestamp alignment. The processed vertex, normal, and visual images are encoded by a multi-layer and multi-scale ResNet-Transformer, while normalized IMU data is handled by a lightweight LSTM. Latent features are refined through an adaptive degradation filter. Finally, an LSTM-based decoder estimates the 6-DOF vehicle pose with corresponding confidence scores.
  • Figure 2: Architecture of the Multi-layer and Multi-scale Image Encoder. The encoder uses ResNet for multi-scale feature extraction and processes them with a Transformer for cross-modal interaction.
  • Figure 3: Temporal Feature Filter.
  • Figure 4: Spatial Feature Filter.
  • Figure 5: Comparison of A2DO (Base+TF+SF) and Soft-Mask strategies on the map 05 Dynamic Foggy sequences, demonstrating superior robustness of A2DO in challenging conditions.
  • ...and 1 more figures
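
The TL;DR mentions a homoscedastic loss with learnable uncertainty, and Figure 1 notes that the decoder outputs pose estimates with confidence scores. The paper's exact loss is not reproduced in this summary, so the sketch below assumes one widely used formulation: each error term is scaled by exp(-s) and regularized by s, where s is a learnable log-variance. The function name, argument names, and the split into translation and rotation terms are assumptions made for illustration.

```python
import math


def uncertainty_weighted_loss(trans_err, rot_err, s_trans, s_rot):
    """Combine translation and rotation errors with learnable log-variances.

    Assumed form: exp(-s) * err + s for each term. A larger s down-weights
    its error term, while the additive s keeps it from growing without bound.
    """
    return (math.exp(-s_trans) * trans_err + s_trans
            + math.exp(-s_rot) * rot_err + s_rot)


def optimal_log_variance(err):
    """Log-variance that minimizes exp(-s) * err + s for a fixed err > 0.

    Setting the derivative -exp(-s) * err + 1 to zero gives s = log(err).
    """
    return math.log(err)


if __name__ == "__main__":
    trans_err, rot_err = 0.5, 0.02  # hypothetical per-batch errors
    # With both log-variances at zero, the loss is the plain sum of errors.
    print(uncertainty_weighted_loss(trans_err, rot_err, 0.0, 0.0))
    # At the per-term optimum, each term contributes 1 + log(err).
    print(uncertainty_weighted_loss(
        trans_err, rot_err,
        optimal_log_variance(trans_err), optimal_log_variance(rot_err)))
```

In training, `s_trans` and `s_rot` would be parameters updated by the optimizer alongside the network weights, so the balance between translation and rotation terms is learned rather than hand-tuned.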