Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses

Zhiwen Fan, Pu Wang, Yang Zhao, Yibo Zhao, Boris Ivanovic, Zhangyang Wang, Marco Pavone, Hao Frank Yang

TL;DR

This work reframes traffic crash analysis as a text-based reasoning problem by creating the CrashEvent dataset, which textualizes rich, multimodal crash records from Washington State (2022). It then presents CrashLLM, a fine-tuned LLaMA-2-based framework that conducts event-level crash prediction across injury, severity, and accident type by leveraging large-language-model reasoning capabilities and task-specific special tokens. CrashLLM substantially outperforms traditional baselines and enables what-if situational analyses to explore how factors like alcohol use, icy road conditions, and work zones shift crash distributions, offering actionable insights for transportation safety. The study provides a public benchmark, demonstrating the potential of NLP-driven, causality-aware crash analysis to inform city-level safety interventions and policy planning.

Abstract

The increasing rate of road accidents worldwide not only causes significant loss of life but also imposes billions in financial burdens on societies. Current research in traffic crash frequency modeling and analysis has predominantly approached the problem as a classification task, focusing mainly on learning-based classification or ensemble learning methods. These approaches often overlook the intricate relationships among the infrastructure, environmental, human, and contextual factors related to traffic crashes and risky situations. In contrast, we first propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports and incorporating infrastructure data together with environmental and traffic information, both textual and visual, from Washington State. Leveraging this rich dataset, we formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes, such as crash type, severity, and number of injuries, based on contextual and environmental factors. The proposed model, CrashLLM, distinguishes itself from existing solutions by leveraging the inherent text reasoning capabilities of LLMs to parse and learn from complex, unstructured data, thereby enabling a more nuanced analysis of contributing factors. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes, with the average F1 score rising from 34.9% to 53.8%. Furthermore, CrashLLM can provide valuable insights for numerous open-world what-if situational-awareness traffic safety analyses using learned reasoning features, which existing models cannot offer. We make our benchmark, datasets, and model publicly available for further exploration.

Paper Structure

This paper contains 24 sections, 1 equation, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Overview of CrashEvent and CrashLLM. Traffic crashes present a significant problem worldwide. We collected traffic crash records from Washington State and used textualization to reformulate these records. The fine-tuned LLMs take this textualized input to predict and analyze traffic crashes.
  • Figure 2: Illustration of Data Pre-processing and Text Prompt Design. We illustrate the process that turns raw data (HSIS, satellite images, crash reports) into textualized prompts in the CrashEvent dataset. The original data are reorganized into general data and infrastructure data. Additionally, police accident reports, crashworthiness data, driver license data, and state highway department data are reorganized by event into event data and unit-based data. These four types of data are textualized into paragraphs of approximately 300 words with the aid of ChatGPT. The blue arrow in the figure represents data reorganization, while the beige arrow represents textualization via an LLM.
  • Figure 3: Summary of Model Confusion Matrices. We display the confusion matrix for our top-performing model, LLaMA-70B, alongside those of all baseline models. Baseline models tend to predict the category with the highest number of instances (e.g., zero injury, no apparent injury). Our CrashLLM, on the other hand, does not collapse onto a single class, suggesting that the improved accuracy arises from enhanced reasoning capabilities.
  • Figure 4: What-if Situational Analysis. In all three panels, the numbers along the y-axis are the changes in case counts between the unperturbed and perturbed data. Zero on the x-axis represents no change after perturbing the data. The total number of crash cases in the testing set remains unchanged.
  • Figure 5: Satellite image generation by querying the Google Maps service using locations from the HSIS datasets.
  • ...and 5 more figures
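
The case-change counts described for Figure 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: `what_if_case_changes`, `toy_predict`, `set_alcohol`, the record fields, and the label names are all hypothetical, and a rule-based stand-in replaces the fine-tuned LLM. The sketch only shows the bookkeeping: predict on the original records, predict again after perturbing one factor, and report the per-class difference, which sums to zero because the number of test cases does not change.

```python
from collections import Counter
from typing import Callable, Dict, List

Record = Dict[str, str]


def what_if_case_changes(
    records: List[Record],
    predict: Callable[[Record], str],
    perturb: Callable[[Record], Record],
) -> Dict[str, int]:
    """Per-class change in predicted case counts after a perturbation."""
    before = Counter(predict(r) for r in records)
    after = Counter(predict(perturb(r)) for r in records)
    labels = sorted(set(before) | set(after))
    # Counter returns 0 for labels that are missing on one side.
    return {label: after[label] - before[label] for label in labels}


def toy_predict(record: Record) -> str:
    """Hypothetical stand-in for the model: a hand-written rule, not an LLM."""
    if record["alcohol"] == "yes" or record["road_surface"] == "ice":
        return "injury"
    return "no apparent injury"


def set_alcohol(record: Record) -> Record:
    """Hypothetical perturbation: mark every crash as alcohol-involved."""
    changed = dict(record)  # copy so the original record is untouched
    changed["alcohol"] = "yes"
    return changed


# Made-up illustrative records, not drawn from CrashEvent.
records = [
    {"alcohol": "no", "road_surface": "dry"},
    {"alcohol": "no", "road_surface": "ice"},
    {"alcohol": "yes", "road_surface": "dry"},
    {"alcohol": "no", "road_surface": "wet"},
]

changes = what_if_case_changes(records, toy_predict, set_alcohol)
print(changes)
```

In this toy setup, two cases move from "no apparent injury" to "injury", so the positive and negative changes cancel, mirroring the caption's note that the total number of test cases stays fixed.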