AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Shuo Xing, Hongyuan Hua, Xiangbo Gao, Shenzhe Zhu, Renjie Li, Kexin Tian, Xiaopeng Li, Heng Huang, Tianbao Yang, Zhangyang Wang, Yang Zhou, Huaxiu Yao, Zhengzhong Tu

TL;DR

AutoTrust presents the first holistic benchmark for evaluating the trustworthiness of DriveVLMs along five dimensions (trustfulness, safety, robustness, privacy, and fairness) across eight open datasets and six VLMs. The evaluations reveal vulnerabilities in DriveVLMs, notably privacy leakage and fragility under adversarial and out-of-distribution conditions, while generalist models often outperform driving-specialist baselines in overall trustworthiness. The GPT-4o-based evaluation framework provides scalable, human-aligned scoring, complemented by human validation, and exposes distinct risk profiles across model families. These findings underscore the need for standardized, multi-dimensional trustworthiness assessments for DriveVLMs, and the public dataset and codebase are intended to support safer deployment in autonomous driving systems.

Abstract

Recent advancements in large vision language models (VLMs) tailored for autonomous driving (AD) have shown strong scene understanding and reasoning capabilities, making them undeniable candidates for end-to-end driving systems. However, limited work exists on the trustworthiness of DriveVLMs, a critical factor that directly impacts public transportation safety. In this paper, we introduce AutoTrust, a comprehensive trustworthiness benchmark for large vision-language models in autonomous driving (DriveVLMs) that considers diverse perspectives, including trustfulness, safety, robustness, privacy, and fairness. We constructed the largest visual question-answering dataset for investigating trustworthiness issues in driving scenarios, comprising over 10k unique scenes and 18k queries. We evaluated six publicly available VLMs, spanning generalist to specialist and open-source to commercial models. Our exhaustive evaluations have unveiled previously undiscovered vulnerabilities of DriveVLMs to trustworthiness threats. Specifically, we found that generalist VLMs such as LLaVA-v1.6 and GPT-4o-mini surprisingly outperform specialized models fine-tuned for driving in terms of overall trustworthiness. DriveVLMs such as DriveLM-Agent are particularly vulnerable to disclosing sensitive information. Additionally, both generalist and specialist VLMs remain susceptible to adversarial attacks and struggle to ensure unbiased decision-making across diverse environments and populations. Our findings call for immediate and decisive action to address the trustworthiness of DriveVLMs, an issue of critical importance to public safety and the welfare of all citizens relying on autonomous transportation systems. We release all code and datasets at https://github.com/taco-group/AutoTrust.

Paper Structure

This paper contains 76 sections, 6 equations, 18 figures, 34 tables, 1 algorithm.

Figures (18)

  • Figure 1: We present AutoTrust, a comprehensive benchmark for assessing the trustworthiness of large vision language models for autonomous driving (i.e., DriveVLMs), covering five key dimensions: Trustfulness, Safety, Robustness, Privacy, and Fairness. Our evaluation uncovers significant trustworthiness issues in existing DriveVLMs, underscoring an urgent need for attention and action to address these critical concerns.
  • Figure 2: Ego Fairness Evaluation Results: Scatter plot showing the weighted average of each model's Demographic Accuracy Difference (DAD) and Worst Accuracy (WA) for the age, gender, and race attributes of the driver object, as well as the type, color, and brand attributes of the ego car object, across the CoVLA, DriveLM, and LingoQA datasets. Points closer to the top-right indicate better overall performance.
  • Figure 3: Scene Fairness Evaluation Results: Heat map of model performance (accuracy, %) across the type and color attributes of surrounding vehicles.
  • Figure 4: Factuality evaluation. Left: radar chart for closed-ended questions. Right: radar chart for open-ended questions.
  • Figure 5: Demonstration of an image-level attack. Adding a carefully designed, perceptually invisible perturbation to the clean image leads the model to output an incorrect answer. The perturbation intensity is amplified 10 times for better visualization.
  • ...and 13 more figures
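
The Figure 2 caption names two fairness metrics, Demographic Accuracy Difference (DAD) and Worst Accuracy (WA), without defining them on this page. The sketch below assumes one common reading of those names: DAD as the gap between the highest and lowest per-group accuracy, and WA as the lowest per-group accuracy. These definitions, the function names, and the `(group, is_correct)` record format are illustrative assumptions, not the paper's confirmed formulation.

```python
from collections import defaultdict


def group_accuracies(records):
    """Return accuracy per group from (group, is_correct) pairs.

    The record format is a hypothetical choice for this sketch.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        hits[group] += int(bool(is_correct))
    return {g: hits[g] / totals[g] for g in totals}


def dad_and_wa(records):
    """Return (DAD, WA) under the assumed definitions.

    DAD: max per-group accuracy minus min per-group accuracy (assumed).
    WA:  min per-group accuracy (assumed).
    """
    acc = group_accuracies(records)
    if not acc:
        raise ValueError("no records supplied")
    values = list(acc.values())
    return max(values) - min(values), min(values)


if __name__ == "__main__":
    # Toy data: two answers per (hypothetical) driver age group.
    toy = [("young", True), ("young", True), ("old", True), ("old", False)]
    dad, wa = dad_and_wa(toy)
    print(f"DAD={dad:.2f}, WA={wa:.2f}")
```

Under this reading, a lower DAD and a higher WA both indicate more even treatment across groups, so a plot that places better models in one corner has to orient the DAD axis accordingly.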