AUTHENTICATION: Identifying Rare Failure Modes in Autonomous Vehicle Perception Systems using Adversarially Guided Diffusion Models

Mohammad Zarei, Melanie A Jutras, Eliana Evans, Mike Tan, Omid Aaramoon

TL;DR

This paper tackles the problem of identifying rare failure modes (RFMs) in autonomous vehicle perception under long-tail conditions by introducing an Authentication framework that combines adversarially guided diffusion-based inpainting with explainability. It generates realistic RFMs around fixed objects, guided by object-detection losses, and extends the approach to multi-modal outputs while enforcing realism through consistency verification. A natural-language explainability layer using Grad-CAM overlays and GPT-4o captions translates the failures into actionable root-cause descriptions for developers and policymakers. The results demonstrate that RFMs can be produced with high realism and that the accompanying captions provide meaningful insights into attention disruptions and environmental factors, potentially improving robustness and safety in AV systems.

Abstract

Autonomous Vehicles (AVs) rely on artificial intelligence (AI) to accurately detect objects and interpret their surroundings. However, even when trained using millions of miles of real-world data, AVs are often unable to detect rare failure modes (RFMs). The problem of RFMs is commonly referred to as the "long-tail challenge", because the data distribution contains many scenarios that are each very rarely seen. In this paper, we present a novel approach that uses generative and explainable AI techniques to aid in understanding RFMs. Our methods can be used to enhance the robustness and reliability of AVs when combined with downstream model training and testing. We extract segmentation masks for objects of interest (e.g., cars) and invert them to create environmental masks. These masks, combined with carefully crafted text prompts, are fed into a custom diffusion model. We leverage the Stable Diffusion inpainting model, guided by adversarial noise optimization, to generate images of diverse environments designed to evade object detection models and expose vulnerabilities in AI systems. Finally, we produce natural language descriptions of the generated RFMs that can guide developers and policymakers in improving the safety and reliability of AV systems.
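
The mask inversion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' code: the function names `environment_mask` and `composite`, the 4x4 toy arrays, and the pixel values are all hypothetical, and a real pipeline would obtain the object mask from a segmentation model and the generated image from an inpainting model.

```python
import numpy as np

def environment_mask(object_mask: np.ndarray) -> np.ndarray:
    """Invert a binary object mask: True marks pixels that may be repainted."""
    return ~object_mask.astype(bool)

def composite(original: np.ndarray, generated: np.ndarray,
              object_mask: np.ndarray) -> np.ndarray:
    """Keep object pixels from the original; take all others from the generated image."""
    keep = object_mask.astype(bool)[..., None]  # add a channel axis for broadcasting
    return np.where(keep, original, generated)

# Toy 4x4 frame with a 2x2 "car" in the middle (hypothetical data).
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:3, 1:3] = True
env_mask = environment_mask(object_mask)

original = np.full((4, 4, 3), 100, dtype=np.uint8)   # stand-in for the seed image
generated = np.full((4, 4, 3), 200, dtype=np.uint8)  # stand-in for the inpainted image
result = composite(original, generated, object_mask)
```

Here the object pixels survive unchanged while the surrounding environment is replaced, which is the property that lets the generated scene vary around a fixed object.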

Paper Structure

This paper contains 19 sections, 1 equation, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Authentication Pipeline: We use the Segment Anything Model (Kirillov et al., 2023) to extract the environment mask around the car. The mask of the environment, along with a prompt, the original image, and the object detector model, is then fed into our adversarially guided diffusion model to generate RFMs.
  • Figure 2: Drone and truck seed images on the left, generated RFM images in center and right.
  • Figure 3: Generated images with environmental conditions of night and glare, fog, reflections and yellow foliage, overcast sky, and snow from left to right with seed image in top left.
  • Figure 4: RFM status maintained when original image (left) is altered by changing car color to red (center) or blue (right).
  • Figure 5: RFM status maintained when original image (left) is altered by changing season to winter (right).
  • ...and 3 more figures
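
The idea of guiding generation with a detection loss, as described for Figure 1, can be sketched in a heavily simplified form. The sketch below is a toy under loud assumptions: it uses neither Stable Diffusion nor a real object detector. A hypothetical sigmoid-of-linear score (`toy_detector_score`) stands in for detector confidence, and gradient descent is applied directly to pixels rather than to diffusion noise. It only shows how a mask can restrict adversarial updates to the environment while the object region stays fixed.

```python
import numpy as np

def toy_detector_score(image: np.ndarray, weights: np.ndarray) -> float:
    """Hypothetical stand-in for detector confidence: sigmoid of a linear score."""
    return float(1.0 / (1.0 + np.exp(-np.sum(weights * image))))

def adversarial_step(image, weights, env_mask, step_size=0.1):
    """One gradient-descent step that lowers the score on environment pixels only."""
    s = toy_detector_score(image, weights)
    grad = s * (1.0 - s) * weights  # analytic gradient of the sigmoid-linear score
    return image - step_size * grad * env_mask  # masked update: object pixels untouched

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, (8, 8))   # toy grayscale "scene"
weights = rng.normal(0.0, 1.0, (8, 8))  # toy detector parameters (assumption)
env_mask = np.ones((8, 8))
env_mask[3:5, 3:5] = 0.0                # object region is protected from edits

start = image.copy()
score_before = toy_detector_score(start, weights)
for _ in range(50):
    image = adversarial_step(image, weights, env_mask)
score_after = toy_detector_score(image, weights)
```

In this toy setting the score drops while the protected region is bit-for-bit unchanged. The framework described above instead optimizes within a diffusion inpainting process, which is what keeps the generated environments realistic.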