
HALO: Fault-Tolerant Safety Architecture For High-Speed Autonomous Racing

Aron Harder, Amar Kulkarni, Madhur Behl

TL;DR

HALO is a fault-tolerant safety architecture for high-speed autonomous racing software stacks that addresses runtime faults across perception, planning, control, and communication. It comprises four safety nodes (Graceful Stop, Node Health Monitor, Topic Multiplexer, and Behavioral-Safety Monitor), whose design is driven by a Failure Mode, Effects, and Criticality Analysis (FMECA), and it is validated with real-world data from the Indy Autonomous Challenge. The results show HALO mitigating data-health, node-health, and behavioral-safety faults, enabling safer operation with controlled performance trade-offs. The approach generalizes to safety in other autonomous cyber-physical systems and can inform safety architectures for broader high-speed autonomous applications.

Abstract

The field of high-speed autonomous racing has seen significant advances in recent years, with the rise of competitions such as RoboRace and the Indy Autonomous Challenge providing a platform for researchers to develop software stacks for autonomous race vehicles capable of reaching speeds in excess of 170 mph. Ensuring the safety of these vehicles requires the software to continuously monitor for faults and erroneous operating conditions during high-speed operation, with the goal of mitigating any unreasonable risks posed by malfunctions in sub-systems and components. This paper presents a comprehensive overview of the HALO safety architecture, which has been implemented on a full-scale autonomous racing vehicle as part of the Indy Autonomous Challenge. The paper begins with a failure mode and criticality analysis of the perception, planning, control, and communication modules of the software stack. Specifically, we examine three types of faults: node health, data health, and behavioral-safety faults. To mitigate these faults, the paper then outlines HALO safety archetypes and runtime monitoring methods. Finally, the paper demonstrates the effectiveness of the HALO safety architecture for each fault type using real-world data gathered from autonomous racing vehicle trials during multi-agent scenarios.
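As a rough illustration of what runtime monitoring for data-health faults can look like, the sketch below flags a data stream as stale when no message has arrived within a timeout, and reports the worst severity among stale streams. It is not the HALO implementation: the class names, topic names, timeout values, and the staleness rule are all assumptions made for illustration. Only the recoverable / non-recoverable severity distinction is taken from the Figure 3 caption.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    OK = 0
    RECOVERABLE = 1       # fault the vehicle may recover from
    NON_RECOVERABLE = 2   # fault that should end the run


@dataclass
class TopicWatch:
    name: str                  # hypothetical topic name
    timeout_s: float           # hypothetical staleness threshold in seconds
    severity: Severity         # severity raised when this topic goes stale
    last_seen_s: float = float("-inf")


class DataHealthMonitor:
    """Tracks message arrival times and reports stale topics (illustrative only)."""

    def __init__(self, watches):
        self.watches = {w.name: w for w in watches}

    def on_message(self, name, stamp_s):
        # Record the time at which the latest message on `name` arrived.
        self.watches[name].last_seen_s = stamp_s

    def check(self, now_s):
        # Return (worst severity, list of stale topic names) at time `now_s`.
        worst = Severity.OK
        stale = []
        for w in self.watches.values():
            if now_s - w.last_seen_s > w.timeout_s:
                stale.append(w.name)
                if w.severity.value > worst.value:
                    worst = w.severity
        return worst, stale


monitor = DataHealthMonitor([
    TopicWatch("gps", timeout_s=0.1, severity=Severity.RECOVERABLE),
    TopicWatch("lidar", timeout_s=0.2, severity=Severity.NON_RECOVERABLE),
])
monitor.on_message("gps", 1.0)
monitor.on_message("lidar", 1.0)
```

In a real stack the `check` call would run on a fixed-rate timer, and its result would feed whatever logic decides between continuing, slowing, and stopping the vehicle.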

Paper Structure

This paper contains 30 sections, 7 figures, 5 tables, and 3 algorithms.

Figures (7)

  • Figure 1: An overview of the autonomous racing setup in the Indy Autonomous Challenge held at the Indianapolis Motor Speedway and the Las Vegas Motor Speedway. The top-left gives an example of the radios used for communication between race control, the base station, and the vehicles on track. The top-right shows a LiDAR point cloud used to detect obstacles on track. In the center is an overtake trajectory, as performed during head-to-head racing. The bottom-left is an image of an AV-21 Autonomous Racecar used in the Indy Autonomous Challenge. The bottom-central shows the typical base station setup in pit lane. The bottom-right shows the different race flags that may be given to a vehicle by race control.
  • Figure 2: An overview of the software stack and the flow of data between nodes. The control module, shown in red, uses the wheel speed, engine RPM, and vehicle localization to generate actuator commands to drive the vehicle. The localization module, shown in blue, fuses several localization sources to generate a single best position estimate for the vehicle. The communication module, shown in green, is responsible for communicating with the CAN bus and the base station.
  • Figure 3: The Graceful Stop Node provides data health monitoring, and brings the vehicle to a gradual stop upon detecting a fault. Each data health fault has its own severity. Yellow represents recoverable faults, while orange represents non-recoverable faults.
  • Figure 4: An overview of the Node Health Monitor and Topic Multiplexer Safety Nodes.
  • Figure 5: The transition polygons defined on the Las Vegas Motor Speedway Track. Using these polygons, the vehicle can detect whether it is in the pits or on the track (Pit Exit and Pit Entry), beside the pit boxes (Pit Slowdown and Speed Up), or if it is in the passing zone (Passing Zone Start and End).
  • ...and 2 more figures
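
The transition polygons described in the Figure 5 caption suggest a simple geometric check: given the vehicle's position, find which polygon (if any) contains it. The sketch below does this with a standard ray-casting point-in-polygon test. The zone names echo the caption, but the coordinates, the `EXAMPLE_ZONES` dictionary, and both function names are hypothetical. The source does not state which containment test or coordinate frame HALO uses.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`,
    given as a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings to the right of the point; odd count means inside.
            if x < x_cross:
                inside = not inside
    return inside


def find_zone(x, y, zones):
    """Return the name of the first zone containing (x, y), or None."""
    for name, polygon in zones.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


# Hypothetical rectangular zones in an arbitrary local frame (meters).
EXAMPLE_ZONES = {
    "pit_exit": [(0, 0), (10, 0), (10, 5), (0, 5)],
    "passing_zone_start": [(20, 0), (30, 0), (30, 5), (20, 5)],
}
```

With these example zones, `find_zone(5, 2, EXAMPLE_ZONES)` returns `"pit_exit"`, while a point between the two rectangles, such as `(15, 2)`, returns `None`.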