Vision-Augmented On-Track System Identification for Autonomous Racing via Attention-Based Priors and Iterative Neural Correction

Zhiping Wu, Cheng Hu, Yiqin Wang, Lei Xie, Hongye Su

TL;DR

The S4-augmented framework improves parameter extraction accuracy and reduces lateral force RMSE by over 60% by capturing complex vehicle dynamics, outperforming conventional neural architectures.

Abstract

Operating autonomous vehicles at the absolute limits of handling requires precise, real-time identification of highly non-linear tire dynamics. However, traditional online optimization methods suffer from "cold-start" initialization failures and struggle to model high-frequency transient dynamics. To address these bottlenecks, this paper proposes a novel vision-augmented, iterative system identification framework. First, a lightweight CNN (MobileNetV3) translates visual road textures into a continuous heuristic friction prior, providing a robust "warm-start" for parameter optimization. Next, an S4 model captures complex temporal dynamic residuals, circumventing the memory and latency limitations of traditional MLPs and RNNs. Finally, a derivative-free Nelder-Mead algorithm iteratively extracts physically interpretable Pacejka tire parameters via a hybrid virtual simulation. Co-simulation in CarSim demonstrates that the lightweight vision backbone reduces friction estimation error by 76.1% using 85% fewer FLOPs, accelerating cold-start convergence by 71.4%. Furthermore, the S4-augmented framework improves parameter extraction accuracy and decreases lateral force RMSE by over 60% by effectively capturing complex vehicle dynamics, demonstrating superior performance compared to conventional neural architectures.
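As a rough illustration of the final step in the abstract, the sketch below fits simplified Pacejka "Magic Formula" parameters with a derivative-free Nelder-Mead search, warm-started from a friction prior. It is not the paper's implementation: the four-parameter formula, the synthetic data, the normal load `Fz`, the prior value `mu_prior`, and every function name are assumptions made for illustration. The paper's hybrid virtual simulation and S4 residual correction are omitted, so the loss here compares the formula directly against force samples.

```python
import math

def pacejka_fy(alpha, B, C, D, E):
    """Simplified Pacejka lateral force for slip angle alpha (rad)."""
    x = B * alpha
    return D * math.sin(C * math.atan(x - E * (x - math.atan(x))))

def rmse(params, alphas, forces):
    """Root-mean-square error between the formula and measured forces."""
    B, C, D, E = params
    sq = [(pacejka_fy(a, B, C, D, E) - f) ** 2 for a, f in zip(alphas, forces)]
    return math.sqrt(sum(sq) / len(sq))

def nelder_mead(f, x0, rel_step=0.05, max_iter=500, tol=1e-9):
    """Minimal Nelder-Mead simplex search (reflect, expand, contract, shrink)."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):  # initial simplex: perturb one coordinate at a time
        v = list(x0)
        v[i] = v[i] * (1 + rel_step) if v[i] != 0 else rel_step
        simplex.append(v)
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if values[-1] - values[0] < tol:
            break
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)
        fr = f(xr)
        if fr < values[0]:  # reflection is the new best: try expanding
            xe = along(2.0)
            fe = f(xe)
            simplex[-1], values[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < values[-2]:  # better than second worst: accept reflection
            simplex[-1], values[-1] = xr, fr
        else:  # contract outside or inside, else shrink toward the best vertex
            xc = along(0.5) if fr < values[-1] else along(-0.5)
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]

# Synthetic "measurements" from assumed ground-truth parameters (illustrative only).
Fz = 4000.0                      # assumed normal load in newtons
true_params = [10.0, 1.3, 0.9 * Fz, 0.97]
alphas = [i * 0.005 for i in range(-40, 41)]
forces = [pacejka_fy(a, *true_params) for a in alphas]

# Warm start: the peak factor D is seeded from a hypothetical friction prior.
mu_prior = 0.8
x0 = [8.0, 1.5, mu_prior * Fz, 0.5]

rmse_init = rmse(x0, alphas, forces)
fit_params, rmse_fit = nelder_mead(lambda p: rmse(p, alphas, forces), x0)
print(f"initial RMSE: {rmse_init:.2f} N, fitted RMSE: {rmse_fit:.2f} N")
```

Seeding `D` with `mu_prior * Fz` reflects the warm-start idea: in this simplified formula the peak lateral force is roughly friction times normal load, so a reasonable friction prior places the initial simplex near the right force scale before any search is done.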

Paper Structure (22 sections, 15 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Schematic representation of the dynamic single-track model.
  • Figure 2: Architecture of the vision-based friction estimation module. The system utilizes a MobileNetV3-Small backbone across four hierarchical stages for efficient feature extraction. The Inverted Residual Block topology incorporates an attention mechanism for texture recalibration. The Probabilistic Friction Mapper executes a classification-to-regression mapping, weighting the Softmax distribution $\mathbf{p}$ by the physical basis vector $\mathcal{B}$ to derive the continuous expected friction coefficient $\hat{\mu}$.
  • Figure 3: Architecture of the S4 model. The diagram illustrates the transition from an initialized continuous state-space model to a discrete convolutional representation, enabling parallelized computation of vehicle dynamic residuals.
  • Figure 4: Iterative framework for tire model identification.
  • Figure 5: Visualization of the data acquisition phase in the CarSim environment.
  • ...and 2 more figures
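
The classification-to-regression mapping described in the Figure 2 caption can be read as an expectation, $\hat{\mu} = \sum_i p_i \mathcal{B}_i$, where $\mathbf{p}$ is the Softmax output and $\mathcal{B}$ holds one friction value per surface class. The sketch below shows that reading only. The class labels, the basis values, the logits, and the function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_friction(logits, basis):
    """Weight the softmax distribution p by the basis vector to get mu_hat."""
    p = softmax(logits)
    return sum(pi * bi for pi, bi in zip(p, basis))

# Hypothetical surface classes and nominal friction values (illustrative only).
basis = [0.9, 0.5, 0.2]      # e.g. dry, wet, snow-covered
logits = [2.0, 1.0, -1.0]    # made-up classifier outputs for one image

mu_hat = expected_friction(logits, basis)
print(f"expected friction coefficient: {mu_hat:.3f}")
```

Because the result is a convex combination of the basis values, the estimate always lies between the smallest and largest entries of the basis vector and varies continuously with classifier confidence, which is what lets a discrete classifier supply a continuous prior.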