UNet-Based Keypoint Regression for 3D Cone Localization in Autonomous Racing

Mariia Baidachna; James Carty; Aidan Ferguson; Joseph Agrane; Varad Kulkarni; Aubrey Agub; Michael Baxendale; Aaron David; Rachel Horton; Elliott Atkinson

TL;DR

This work presents a UNet-based neural network for keypoint detection on cones, leveraging the largest custom-labeled dataset the authors have assembled and achieving substantial improvements in keypoint accuracy over conventional methods.

Abstract

Accurate cone localization in 3D space is essential in autonomous racing for precise navigation around the track. Approaches that rely on traditional computer vision algorithms are sensitive to environmental variations, and neural networks are often trained on limited data and are infeasible to run in real time. We present a UNet-based neural network for keypoint detection on cones, leveraging the largest custom-labeled dataset we have assembled. Our approach enables accurate cone position estimation and the potential for color prediction. Our model achieves substantial improvements in keypoint accuracy over conventional methods. Furthermore, we leverage our predicted keypoints in the perception pipeline and evaluate the end-to-end autonomous system. Our results show high-quality performance across all metrics, highlighting the effectiveness of this approach and its potential for adoption in competitive autonomous racing systems.

Paper Structure (13 sections, 4 equations, 10 figures, 1 table)

Introduction
Related Work
Current Limitations
Methods
System Background
Dataset Curation
Model Architecture and Training
Other Approaches
Cone Localization
Results
Overall Perception Pipeline
Real-Time Processing Analysis
Discussion and Conclusion

Figures (10)

Figure 1: Overview of the camera pipeline. Different cone position estimation methods run in parallel and are combined with an Extended Kalman Filter. Red and green represent system inputs and outputs, respectively; blue represents pipeline subroutines.
Figure 2: A ZED2 left camera frame with YOLOv8 bounding boxes. Solid bars above cone annotations represent the confidence of the YOLOv8 predictions.
Figure 3: General keypoint labels and their corresponding labels on a blue and yellow cone.
Figure 4: A high-level overview of the architecture used in our KPR model.
Figure 5: Training and validation loss at each step over the course of model training.
...and 5 more figures
