Parallel Neural Computing for Scene Understanding from LiDAR Perception in Autonomous Racing

Suwesh Prasad Sah

TL;DR

This paper addresses real-time scene understanding for high-speed autonomous racing using LiDAR data. It introduces the Parallel Perception Network (PPN), which runs a segmentation network and a reconstruction network in parallel on separate GPUs to process sequences of BEV maps derived from 3D LiDAR point clouds. Both networks are trained with an MSSCE loss that combines MSE, SmoothL1, and edge-preserving terms, and the parallel configuration demonstrates a twofold inference speedup over a sequential baseline on RACECAR LiDAR data. The work highlights the practicality of hardware-enabled parallelism for multi-network perception and sets the stage for future multi-sensor parallel perception in high-speed autonomous racing.

Abstract

Autonomous driving in high-speed racing, as opposed to urban environments, presents significant challenges in scene understanding due to rapid changes in the track environment. Traditional sequential network approaches may struggle to meet the real-time knowledge and decision-making demands of an autonomous agent covering large displacements in a short time. This paper proposes a novel baseline architecture for developing sophisticated models capable of true hardware-enabled parallelism, achieving neural processing speeds that mirror the agent's high velocity. The proposed model, the Parallel Perception Network (PPN), consists of two independent neural networks, a segmentation network and a reconstruction network, running in parallel on separate accelerated hardware. The model takes raw 3D point cloud data from the LiDAR sensor as input and converts it into a 2D Bird's Eye View Map on both devices. Each network independently extracts features from its input along the space and time dimensions, and the two networks produce their outputs in parallel. The model is trained on a system with two NVIDIA T4 GPUs, using a combination of loss functions, including edge preservation, and demonstrates a 2x speedup in model inference time compared to a sequential configuration. Implementation is available at: https://github.com/suwesh/Parallel-Perception-Network. Learned parameters of the trained networks are provided at: https://huggingface.co/suwesh/ParallelPerceptionNetwork.
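
The dispatch pattern described in the abstract, two independent networks consuming the same BEV sequence at the same time, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `segmentation_net` and `reconstruction_net` are toy placeholder functions rather than the paper's networks, and a thread pool stands in for the two GPUs. In standard CPython builds, threads do not speed up pure-Python work like this, so the sketch shows only the control flow, not the reported speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def segmentation_net(bev_sequence):
    # Hypothetical placeholder: mark cells of the latest frame that are non-empty.
    latest = bev_sequence[-1]
    return [[1 if value > 0 else 0 for value in row] for row in latest]


def reconstruction_net(bev_sequence):
    # Hypothetical placeholder: average each cell over the time dimension.
    n_frames = len(bev_sequence)
    n_rows = len(bev_sequence[0])
    n_cols = len(bev_sequence[0][0])
    return [
        [sum(frame[r][c] for frame in bev_sequence) / n_frames for c in range(n_cols)]
        for r in range(n_rows)
    ]


def run_parallel(bev_sequence):
    # Submit both networks before waiting on either result, so neither
    # blocks the other. The two workers stand in for two devices.
    with ThreadPoolExecutor(max_workers=2) as pool:
        seg_future = pool.submit(segmentation_net, bev_sequence)
        rec_future = pool.submit(reconstruction_net, bev_sequence)
        return seg_future.result(), rec_future.result()


def run_sequential(bev_sequence):
    # Baseline for comparison: the second network waits for the first.
    return segmentation_net(bev_sequence), reconstruction_net(bev_sequence)
```

Both functions return the same pair of outputs. The only difference is whether the second network has to wait for the first, which is the property the parallel configuration exploits.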

Paper Structure

This paper contains 10 sections, 4 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: Conversion of 3D point clouds into 2D BEV map. This process involves: (a) Point clouds in 3D space. (b) Voxelization, where the 3D space is divided into discrete voxels and each voxel holds the max z-axis value. (c) 2D BEV map obtained by projecting 3D voxels onto a 2D plane by taking the maximum along z-axis.
  • Figure 2: Overview of PPN. The segmentation network with skip connections is a spatio-temporal pyramid network, and the reconstruction network is an autoencoder.
  • Figure 3: Architecture of Parallel Perception Network.
  • Figure 4: PPN model's experimental setup on parallel accelerated hardware.
  • Figure 5: RGB image of segmented output map with hand-annotated motion information. The red box shows the current position of the vehicle, and the red line shows its motion from the $(t-15)^{th}$ to the $t^{th}$ time frame.
  • ...and 2 more figures
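
The conversion described in the Figure 1 caption (voxelize, keep the maximum z per voxel, then take the maximum along the z-axis) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the function name, grid extents, cell size, and fill value for empty cells are all hypothetical. Because the maximum over per-voxel maxima equals the maximum over all points in a vertical column, the sketch collapses the two steps into one pass.

```python
def point_cloud_to_bev(points, x_range=(0.0, 4.0), y_range=(0.0, 4.0),
                       cell=1.0, empty=0.0):
    """Project (x, y, z) points onto a 2D grid, keeping the max z per cell.

    All parameter names and default values are illustrative assumptions.
    """
    n_rows = round((x_range[1] - x_range[0]) / cell)
    n_cols = round((y_range[1] - y_range[0]) / cell)
    best = [[None] * n_cols for _ in range(n_rows)]

    for x, y, z in points:
        # Discard points that fall outside the grid extents.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        r = min(int((x - x_range[0]) / cell), n_rows - 1)
        c = min(int((y - y_range[0]) / cell), n_cols - 1)
        if best[r][c] is None or z > best[r][c]:
            best[r][c] = z

    # Cells that received no points get the fill value.
    return [[empty if v is None else v for v in row] for row in best]


bev = point_cloud_to_bev([
    (0.5, 0.5, 1.0),   # shares cell (0, 0) with the next point
    (0.7, 0.2, 2.5),   # higher z wins in cell (0, 0)
    (3.2, 1.1, -0.3),  # lands in cell (3, 1)
    (9.0, 9.0, 9.0),   # outside the grid, ignored
])
```

A sequence of such maps over consecutive LiDAR sweeps would form the spatio-temporal input that both networks consume.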