Steering Prediction via a Multi-Sensor System for Autonomous Racing

Zhuyun Zhou, Zongwei Wu, Florian Bolli, Rémi Boutteau, Fan Yang, Radu Timofte, Dominique Ginhac, Tobi Delbruck

TL;DR

This work integrates an event camera with the 2D LiDAR system conventionally used on autonomous racing cars to provide richer temporal information, and introduces a new fusion learning policy that guides the fusion process and improves robustness to misalignment.

Abstract

Autonomous racing has rapidly gained research attention. Traditionally, racing cars rely on 2D LiDAR as their primary visual system. In this work, we explore the integration of an event camera with the existing system to provide enhanced temporal information. Our goal is to fuse the 2D LiDAR data with event data in an end-to-end learning framework for steering prediction, which is crucial for autonomous racing. To the best of our knowledge, this is the first study addressing this challenging research topic. We start by creating a multisensor dataset specifically for steering prediction. Using this dataset, we establish a benchmark by evaluating various state-of-the-art (SOTA) fusion methods. We observe that existing methods often incur substantial computational costs. To address this, we apply low-rank techniques to propose a novel fusion design that is both efficient and effective. We also introduce a new fusion learning policy to guide the fusion process, enhancing robustness against misalignment. Our fusion architecture provides better steering prediction than LiDAR alone, reducing the RMSE from 7.72 to 1.28. Compared to the second-best fusion method, our method uses only 11% of the learnable parameters while achieving better accuracy. The source code, dataset, and benchmark will be released to promote future research.
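
For readers less familiar with the metric, the following minimal sketch shows how a root-mean-square error over steering predictions is commonly computed, and what the reported drop from 7.72 to 1.28 amounts to in relative terms. The function names and the toy inputs are illustrative assumptions and are not taken from the authors' code.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error between predicted and ground-truth values."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared_error / len(predicted))


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which an error metric shrinks relative to a baseline."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Toy steering values (hypothetical, for illustration only).
    print(rmse([0.0, 0.0], [3.0, 4.0]))
    # RMSE values reported in the abstract: LiDAR alone vs. the fused model.
    print(f"{relative_reduction(7.72, 1.28):.1%}")
```

Under this definition, going from 7.72 to 1.28 corresponds to a reduction of roughly 83% in RMSE.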

Paper Structure (21 sections, 10 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: (a) The F1tenth racing car used in our experiments is equipped with a DAVIS346 event camera [Brandli2014-davis] and a Hokuyo 2D LiDAR sensor. (b) Our network processes two consecutive LiDAR scans captured at times $t$ and $t+1$, along with an event-accumulated frame that includes all "on" and "off" brightness change events occurring between $t$ and $t+1$. The network's objective is to predict the steering angle at time $t+1$. The LiDAR depth maps are shown on a blank background that marks areas with no data; black pixels correspond to scan points, with the darkness of each pixel reflecting proximity. We show that the complementary strengths of such a multisensor system can be exploited to achieve accurate steering prediction.
  • Figure 2: (a) Architecture overview. Please zoom in for a better view of the BEV and 2D FOV projections of the 2D LiDAR points. Note that, unlike 3D LiDAR, the 2D LiDAR points form only a quasi-vertical line after projection. This characteristic makes our multisensor fusion particularly challenging and, to the best of our knowledge, this specific issue is addressed here for the first time. The overall architecture follows a conventional pipeline of feature extraction, fusion, and decoding. However, we introduce a novel fusion method with a new learning policy, as illustrated in (b), to fully exploit the mutual benefits between the 2D LiDAR and the event camera, resulting in an efficient yet effective fusion strategy that maximizes joint entropy.
  • Figure 3: Comparison against existing fusion methods. Our approach begins by projecting the input features into a lower-dimensional latent space to reduce complexity. We then introduce a novel attention mechanism based on gated convolution. This combination makes our fusion technique both computationally efficient and highly effective, surpassing existing (transformer) attention methods.
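
The event-accumulated frame described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' preprocessing: the event tuple layout `(timestamp, x, y, polarity)`, the function name `accumulate_events`, and the choice to keep "on" and "off" counts in two separate maps are all hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed event layout: (timestamp, x, y, polarity),
# where polarity > 0 is an "on" event and polarity <= 0 is an "off" event.
Event = Tuple[float, int, int, int]
CountMap = List[List[int]]


def accumulate_events(
    events: Iterable[Event],
    t_start: float,
    t_end: float,
    width: int,
    height: int,
) -> Tuple[CountMap, CountMap]:
    """Count "on" and "off" events per pixel for t_start < timestamp <= t_end."""
    on_counts = [[0] * width for _ in range(height)]
    off_counts = [[0] * width for _ in range(height)]
    for timestamp, x, y, polarity in events:
        if not (t_start < timestamp <= t_end):
            continue  # event lies outside the interval between the two LiDAR scans
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore events that fall outside the sensor array
        if polarity > 0:
            on_counts[y][x] += 1
        else:
            off_counts[y][x] += 1
    return on_counts, off_counts


if __name__ == "__main__":
    # Toy events on a 4x3 grid; a DAVIS346 sensor would use width=346, height=260.
    toy_events = [(0.5, 1, 0, 1), (0.6, 1, 0, 1), (0.7, 2, 1, -1), (1.5, 0, 0, 1)]
    on_map, off_map = accumulate_events(toy_events, 0.0, 1.0, width=4, height=3)
    for row in on_map:
        print(row)
```

Here `t_start` and `t_end` play the role of the scan times $t$ and $t+1$, so each accumulated frame covers exactly the interval between two consecutive LiDAR scans.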