Unveiling Uniform Shifted Power Law in Stochastic Human and Autonomous Driving Behavior

Wang Chen, Heye Huang, Ke Ma, Hangyu Li, Shixiao Liang, Hang Zhou, Xiaopeng Li

TL;DR

This work identifies a universal shifted power-law tail that governs stochastic driving behavior across human and autonomous vehicles. By decoupling mean dynamics from tail variability and deriving analytical forms for the shifted-power-law distribution, the authors achieve highly accurate fits across global HV/AV datasets with only two parameters, $a$ and $k$ (with $a$ often fixed), enabling robust tail-risk reproduction and crash-rate validation in simulations. The approach yields an average tail fidelity of RP5 ≈ 0.88 and $R^2$ ≈ 0.97, with crash rates in agent-based simulations aligning with real-world statistics for both HVs and AVs, thereby offering a data-efficient foundation for simulation-based safety assessment and certification. The introduced Risk Index, defined as $|k|$, provides an interpretable link between stochastic driving variability and control difficulty, supporting risk-aware design, benchmarking, and regulatory validation for mixed traffic systems.

Abstract

Accurately simulating rare but safety-critical driving behaviors is essential for the evaluation and certification of autonomous vehicles (AVs). However, current models often fail to reproduce realistic collision rates when calibrated on real-world data, largely due to inadequate representation of long-tailed behavioral distributions. Here, we uncover a simple yet unifying shifted power law that robustly characterizes the stochasticity of both human-driven vehicle (HV) and AV behaviors, especially in the long-tail regime. The model adopts a parsimonious analytical form with only one or two parameters, enabling efficient calibration even under data sparsity. Across large-scale, micro-level trajectory data from global HV and AV datasets, the shifted power law achieves an average $R^2$ of 0.97 and a nearly identical tail distribution, uniformly fitting both frequent behaviors and rare safety-critical deviations and significantly outperforming existing Gaussian-based baselines. When integrated into an agent-based traffic simulator, it enables forward-rolling simulations that reproduce realistic crash patterns for both HVs and AVs, achieving rates consistent with real-world statistics and improving the fidelity of safety assessment without post hoc correction. This discovery offers a unified and data-efficient foundation for modeling high-risk behavior and improves the fidelity of simulation-based safety assessments for mixed AV/HV traffic. The shifted power law provides a promising path toward simulation-driven validation and global certification of AV technologies.

Paper Structure

This paper contains 28 sections, 27 equations, 23 figures, 9 tables.

Figures (23)

  • Figure 1: Modeling stochastic driving behavior with the shifted power law. a, Illustration of stochastic driving behavior: the ego vehicle interacts with surrounding vehicles in both the lateral and longitudinal directions. b, Decoupling the longitudinal and lateral acceleration to model the stochastic driving behavior of an HV/AV. c, The AI model predicts the mean acceleration $\hat{y}_{T+1}$ and standard deviation $\hat{\gamma}_{T+1}$ based on the previous states $x_{1:T}$. We then focus on the normalized residual distribution $\bar{\sigma} := (y_{T+1} - \hat{y}_{T+1})/\hat{\gamma}_{T+1}$, where $y_{T+1}$ is the observation at time $T+1$. We assume $\bar{\sigma}$ is state-independent, so the time index is dropped. d, Although classic models adequately approximate driving behavior in safe scenarios, which account for the majority of real-world observations, they systematically underestimate rare, high-risk, long-tail driving behaviors. e, The shifted power law analytically links each possible threshold value $\sigma$ of the normalized residual random variable $\bar{\sigma}$ to the corresponding violation probability $\delta:=\mathbb{P}(|\bar{\sigma}|>|\sigma|)$, enabling interpretable risk modeling and realistic reproduction of safety-critical heavy tails. Together, panels a–e illustrate how the proposed model bridges microscopic behavioral stochasticity and macroscopic tail-risk characterization, providing a unified framework for quantitative analysis of HV/AV driving behaviors.
  • Figure 2: Validation of the shifted power law across global AV and HV datasets. Datasets span three major regions: United States, including CitySim intersections (signalized and non-signalized) for HV, and multiple AV datasets such as MicroSimACC, Central Ohio ACC, Waymo, Argoverse 2, and CATS, covering highway and diverse driving environments. The third driving environment (i.e., highway) includes freeways and contains two, three, or four lanes in each direction. Europe, including the highD dataset (HV highway) and the OpenACC dataset (AV highway). The OpenACC dataset records the car-following behavior of ego vehicles in the platoon on freeways (the fifth driving environment) and circular tracks (the sixth driving environment). China, including the CitySim-freeC dataset (HV highway).
  • Figure 3: Comparison among the proposed shifted power law, the classic Gaussian distribution, and the empirical distribution across AV and HV datasets. Tail accuracy is quantified by the RP5 metric, defined as the ratio of the empirical probability beyond five standard deviations ($|\bar{\sigma}|>5$) to the corresponding model-predicted probability. A value near 1.0 indicates accurate tail estimation. a–c, Distributions of the lateral normalized residual at three locations in the highD dataset. d–f, Distributions of the longitudinal normalized residual in the CitySim dataset. g–i, Distributions of the longitudinal normalized residual in three AV datasets: Argoverse 2, CATS ACC, and MicroSimACC. j–l, Average values of RP5, log-likelihood, and KL divergence achieved by the shifted power law and the Gaussian distribution across all datasets.
  • Figure 4: Reflected real-world driving behavior with fixed scale at $a=5$. a–c, Shifted power-law fitting under fixed scale across representative datasets including CitySim-inB (lateral), highD-Location 2 (longitudinal), and CATS ACC (longitudinal). The model retains high fidelity to empirical distributions ($R^2 > 0.95$), confirming stability across different environments. Lon and Lat represent longitudinal and lateral directions, respectively. d–g, Comparison of the derived Risk Index ($|k|$) across vehicle types and scenarios. d, Averaged Risk Index for AVs and HVs. e, Controlled AV experiments showing improved predictability (smaller Risk Index). f, HV driving at intersections and on highways, where intersections exhibit higher stochasticity. g, Comparison between HV lateral and longitudinal behaviors, showing higher risk in longitudinal control.
  • Figure 5: Crash rate comparison and validation through large-scale simulation. a, Overview of the agent-based simulation framework where each vehicle interacts with at most eight surrounding vehicles. The initial state ($t = 1, \cdots, T$) is sampled from the dataset (HV from the highD dataset and AV from the Waymo dataset), and the movements of all vehicles in the next steps ($t=T+1,\cdots$) are controlled by the model (Gaussian or shifted power law). b, Aggregated crash outcomes and empirical benchmarks, showing that the shifted power law model reproduces real-world crash rates for HVs/AVs. c, Directional breakdown of simulated crashes across longitudinal and lateral dimensions, revealing that approximately 80% of crashes are rear-end impacts, consistent with empirical Risk Index asymmetry and naturalistic driving data.
  • ...and 18 more figures
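
The quantities defined in the captions of Figures 1 and 3 can be illustrated with a minimal Python sketch: the normalized residual $\bar{\sigma}$, the empirical violation probability $\delta$, and the RP5 ratio. The function names are hypothetical and are not taken from the paper. Because the analytical form of the shifted power law is not stated in this excerpt, only the Gaussian baseline is implemented as a model tail; any other tail model can be passed to `rp5` as a callable.

```python
import math


def normalized_residuals(y_obs, y_pred, gamma_pred):
    """Normalized residual (y - y_hat) / gamma_hat for each sample."""
    return [(y - m) / s for y, m, s in zip(y_obs, y_pred, gamma_pred)]


def empirical_violation_probability(residuals, sigma):
    """Empirical estimate of delta = P(|sigma_bar| > |sigma|)."""
    threshold = abs(sigma)
    return sum(abs(r) > threshold for r in residuals) / len(residuals)


def gaussian_violation_probability(sigma):
    """Two-sided tail of a standard normal: P(|Z| > |sigma|) = erfc(|sigma| / sqrt(2))."""
    return math.erfc(abs(sigma) / math.sqrt(2.0))


def rp5(residuals, model_tail):
    """RP5: empirical probability beyond 5 divided by the model-predicted probability.

    model_tail is any callable mapping a threshold to a violation probability
    (an assumption of this sketch, so that different tail models can be compared).
    """
    predicted = model_tail(5.0)
    if predicted <= 0.0:
        raise ValueError("model-predicted tail probability must be positive")
    return empirical_violation_probability(residuals, 5.0) / predicted
```

For example, `rp5(residuals, gaussian_violation_probability)` scores the Gaussian baseline on a set of normalized residuals. Since the Gaussian assigns a probability of roughly $5.7 \times 10^{-7}$ to $|\bar{\sigma}| > 5$, even a handful of empirical exceedances in a moderately sized sample pushes this ratio far above 1, which is the underestimation of heavy tails described in Figure 1d.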