RadarGNN: Transformation Invariant Graph Neural Network for Radar-based Perception

Felix Fent, Philipp Bauerschmidt, Markus Lienkamp

TL;DR

RadarGNN is a graph neural network that uses not only the information of the radar points themselves but also the relationships between them, and it outperforms all previous methods on the RadarScenes dataset.

Abstract

Reliable perception has to be robust against challenging environmental conditions. Therefore, recent efforts have focused on the use of radar sensors in addition to camera and lidar sensors for perception applications. However, the sparsity of radar point clouds and the poor data availability remain challenging for current perception methods. To address these challenges, a novel graph neural network is proposed that uses not only the information of the points themselves but also the relationships between the points. The model is designed to consider both point features and point-pair features, the latter embedded in the edges of the graph. Furthermore, a general approach for achieving transformation invariance is proposed, which is robust against unseen scenarios and also counteracts the limited data availability. The transformation invariance is achieved by an invariant data representation rather than an invariant model architecture, making it applicable to other methods. The proposed RadarGNN model outperforms all previous methods on the RadarScenes dataset. In addition, the effects of different invariances on the object detection and semantic segmentation quality are investigated. The code is available as open-source software at https://github.com/TUMFTM/RadarGNN.
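
The idea of an invariant data representation can be illustrated with a minimal sketch. It is not taken from the RadarGNN code base: the helper names (`knn_edges`, `edge_distances`, `rigid_transform`), the k-nearest-neighbour graph construction, and the choice of Euclidean distance as the only point-pair feature are assumptions made for illustration. The sketch builds a graph over 2D points, stores a point-pair feature on each edge, and checks that this feature is unchanged when the whole point cloud is rotated and translated.

```python
import numpy as np


def knn_edges(points, k):
    """Directed edges (i, j) from each point to its k nearest neighbours (assumed graph construction)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(len(points)), k)
    return np.stack([src, nbrs.reshape(-1)], axis=1)


def edge_distances(points, edges):
    """Point-pair feature per edge: Euclidean distance (illustrative choice)."""
    return np.linalg.norm(points[edges[:, 0]] - points[edges[:, 1]], axis=1)


def rigid_transform(points, angle, shift):
    """Rotate the points by `angle` (radians) about the origin, then translate by `shift`."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return points @ rot.T + shift


rng = np.random.default_rng(0)
points = rng.uniform(-50.0, 50.0, size=(20, 2))  # synthetic stand-in for a radar point cloud
edges = knn_edges(points, k=3)

moved = rigid_transform(points, angle=0.7, shift=np.array([12.0, -4.0]))

feat_before = edge_distances(points, edges)
feat_after = edge_distances(moved, edges)

# Absolute coordinates change under the transformation, the edge features do not.
coords_changed = not np.allclose(points, moved)
invariant = np.allclose(feat_before, feat_after)
print(coords_changed, invariant)
```

Because the invariance lives in the features rather than in the network, any model that consumes only such features inherits it, which is the sense in which the abstract describes the approach as applicable to other methods.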

Paper Structure (15 sections, 3 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Example scenario a) of the RadarScenes dataset Schumann.2021 and its corresponding radar point cloud data in the bird's-eye view. The annotated ground truth data is shown in b), while the model prediction for object classes and bounding boxes is given in c).
  • Figure 2: Model overview from point cloud processing on the left, through graph construction and GNN feature extraction, up to the object detection and semantic segmentation on the right.
  • Figure 3: Definition of the translation and rotation invariant bounding box with respect to the radar point $p_0$ and the reference point $p_{nn}$. The box is defined by its extent ($w, l$), position ($d, \varphi$) and orientation ($\theta_{nn}$) in the bird's-eye view.
  • Figure 4: Confusion matrix of the semantic segmentation results on the RadarScenes test set. The matrix compares the ground truth labels with the model predictions for the five object classes and the background (bg) class.
  • Figure 5: Performance of the RadarGNN model for different invariance levels over the number of training sequences. The performance is normalized to the baseline performance.