
ReMAV: Reward Modeling of Autonomous Vehicles for Finding Likely Failure Events

Aizaz Sharif, Dusica Marijan

TL;DR

ReMAV addresses autonomous vehicle (AV) validation in three steps. It first learns a behavior-based reward representation from offline state-action trajectories using adversarial inverse reinforcement learning (AIRL), then sets a scenario-specific threshold on that reward to flag states where the AV's driving decisions are uncertain. Finally, it applies minimal perturbations only in those states to reveal likely failure events, which shrinks the search space compared with naive adversarial testing. Across single- and multi-agent driving scenarios, ReMAV finds more failure events, and finds them more efficiently, than the baselines and some existing frameworks. Because it treats the AV as a black box, the approach is intended to adapt to different AV architectures and simulation environments.

Abstract

Autonomous vehicles are advanced driving systems that are well known to be vulnerable to various adversarial attacks, compromising vehicle safety and posing a risk to other road users. Rather than actively training complex adversaries by interacting with the environment, there is a need to first intelligently find and reduce the search space to only those states where autonomous vehicles are found to be less confident. In this paper, we propose ReMAV, a black-box testing framework that first uses offline trajectories to analyze the existing behavior of autonomous vehicles and to determine appropriate thresholds for finding the probability of failure events. To this end, we introduce a three-step methodology which i) uses offline state-action pairs of any autonomous vehicle under test, ii) builds an abstract behavior representation using our reward modeling technique to analyze states with uncertain driving decisions, and iii) uses a disturbance model for minimal perturbation attacks in states where the driving decisions are less confident. Our reward modeling technique creates a behavior representation that highlights regions of likely uncertain behavior even when the standard autonomous vehicle performs well. We perform our experiments in a high-fidelity urban driving environment using three different driving scenarios containing single- and multi-agent interactions. Our experiments show increases of 35%, 23%, 48%, and 50% in the occurrences of vehicle collision, road object collision, pedestrian collision, and offroad steering events, respectively, by the autonomous vehicle under test, demonstrating a significant increase in failure events. We compare ReMAV with two baselines and show that it is significantly more effective at generating failure events than the baselines on all evaluation metrics.
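Steps ii) and iii) of the methodology can be illustrated with a minimal sketch. It assumes the reward model has already assigned a scalar reward to each offline state-action pair. The percentile rule for choosing the threshold, the uniform noise model, and the names `select_threshold` and `perturb_if_uncertain` are hypothetical and chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def select_threshold(offline_rewards, percentile=10.0):
    """Pick a threshold from offline rewards.

    Assumption: states whose reward falls in the lowest `percentile`
    of the offline data are treated as uncertain.
    """
    return float(np.percentile(offline_rewards, percentile))


def perturb_if_uncertain(observation, reward, beta, epsilon=0.05, rng=None):
    """Add small bounded noise only when the modeled reward is below beta.

    Returns the (possibly perturbed) observation and a flag telling
    whether a perturbation was applied.
    """
    rng = np.random.default_rng() if rng is None else rng
    if reward >= beta:
        # The AV is considered confident here, so the state is left untouched.
        return observation, False
    noise = rng.uniform(-epsilon, epsilon, size=observation.shape)
    # Observations are assumed to be normalized to [0, 1].
    return np.clip(observation + noise, 0.0, 1.0), True


# Usage with synthetic rewards and a dummy normalized image observation.
offline_rewards = np.arange(100, dtype=float)
beta = select_threshold(offline_rewards)
obs = np.full((4, 4, 3), 0.5)
_, attacked_high = perturb_if_uncertain(obs, reward=50.0, beta=beta)
_, attacked_low = perturb_if_uncertain(obs, reward=1.0, beta=beta)
print(beta, attacked_high, attacked_low)
```

Gating the perturbation on the threshold is what restricts the attack to a small subset of states instead of perturbing every time step.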

Paper Structure (55 sections, 12 equations, 14 figures, 7 tables, 4 algorithms)


Figures (14)

  • Figure 1: Illustration of the ReMAV framework architecture for testing the robustness of AV driving policies in a multi-agent environment. The framework is divided into three steps. The left side represents the first step, where we use the AV under test to obtain its offline trajectories in different driving scenarios. The middle represents the phase where offline trajectories are used to obtain a behavior representation with the help of a reward model $R_{\psi}$. The same reward model is used to collect {state, action, reward} pairs from which the required thresholds $\beta$ for testing are derived. Lastly, the right side shows the AV testing phase, where noise perturbations are added only to those state-action interactions where the AV is uncertain even when its driving behavior appears normal.
  • Figure 2: End-to-end AV driving using DRL for AV agents. The AV receives an input image of 168$\times$168$\times$3, which is passed to the PPO-based DRL model. The actions are selected in the output layer of the AV agent and performed in the next time step of the simulation to obtain a reward and a new observation state.
  • Figure 3: Illustration of the AIRL design as part of the ReMAV framework. Both AV offline trajectories and Generator rollouts are first passed through a VGG-16 layer to reduce their dimensions to only the important features. The Discriminator tries to distinguish between real (non-expert AV) and fake state-action pairs. The prediction is then transformed into a reward value $R_\psi$ that is also used to calculate the advantage function to improve the generator agent policy. Finally, the generator agent trains on a few samples to take better actions in the simulation environment in the next loop.
  • Figure 4: Illustration of the Town03 CARLA urban driving environment. The top left subfigure represents the first driving scenario (Straight) and the top right represents the second driving scenario (Pedestrian). The bottom subfigure shows the third and final driving scenario (three-way) for the experimental evaluation of AVs under test.
  • Figure 5: 2D visualization of the AV standard driving coordinates in all three scenarios.
  • ...and 9 more figures
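
The Figure 3 caption says the Discriminator's prediction is transformed into a reward value $R_\psi$. A minimal sketch of that transformation follows, assuming the standard AIRL formulation in which the discriminator is $D = \exp(f) / (\exp(f) + \pi(a \mid s))$ and the reward is $\log D - \log(1 - D)$. Whether ReMAV uses exactly this form is an assumption; the function names and input values are illustrative only.

```python
import numpy as np


def airl_discriminator(f_value, log_pi):
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi(a|s)).

    Algebraically this equals sigmoid(f - log pi), which is the form used here.
    """
    return 1.0 / (1.0 + np.exp(-(f_value - log_pi)))


def airl_reward(f_value, log_pi):
    """Reward recovered from the discriminator: log D - log(1 - D)."""
    d = airl_discriminator(f_value, log_pi)
    return np.log(d) - np.log1p(-d)


# Illustrative values: f is the learned reward approximator output,
# log_pi is the generator policy's log-probability of the taken action.
f_value = np.array([0.5, -1.0, 2.0])
log_pi = np.log(np.array([0.2, 0.5, 0.9]))
rewards = airl_reward(f_value, log_pi)
print(rewards)
```

Under this formulation the reward simplifies to $f(s, a) - \log \pi(a \mid s)$, so a state-action pair scores highly when the learned function rates it well and the generator policy considers it unlikely.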