Estimation of Kinematic Motion from Dashcam Footage
Evelyn Zhang, Alex Richardson, Jonathan Sprinkle
TL;DR
This work investigates estimating automotive kinematics from dashcam video by aligning the video with CAN bus ground truth data. The authors define five target attributes (speed, yaw, lead-car presence, lead distance, and lead relative speed) to enable event detection in driving footage. Two encoding pathways (an autoencoder and a CNN) feed into Baseline, GRU, and Transformer predictors, and the study explores hyperparameters within practical hardware limits. Results indicate promise for speed and yaw prediction but highlight limitations from dataset size and challenges with the lead-car attributes. The work also provides a blueprint for reproducible data collection using open-source tools.
Abstract
This paper explores how accurately dashcam footage can predict the actual kinematic motion of a car-like vehicle. Our approach pairs ground truth information from the vehicle's on-board data stream, read through the controller area network (CAN), with a time-synchronized dashboard camera mounted to a consumer-grade vehicle, covering 18 hours of driving footage. The contributions of the paper include neural network models that allow us to quantify the accuracy of predicting the vehicle's speed and yaw, the presence of a lead vehicle, and that vehicle's relative distance and speed. In addition, the paper describes how other researchers can gather their own data to perform similar experiments using open-source tools and off-the-shelf technology.
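
To make the time-synchronization idea concrete, the sketch below shows one plausible way to assign a CAN-derived label to each video frame: linear interpolation of a CAN signal at the frame timestamps. It is an illustration only and not necessarily the procedure used in this work. The function name `interpolate_signal`, the use of seconds on a shared clock, and the clamping at both ends of the recording are assumptions made for this example.

```python
from bisect import bisect_left


def interpolate_signal(sample_times, sample_values, frame_times):
    """Linearly interpolate a CAN signal at each video frame timestamp.

    Assumes (hypothetically) that sample_times is strictly increasing and
    that CAN and video timestamps share one clock, in seconds.
    """
    if not sample_times or len(sample_times) != len(sample_values):
        raise ValueError("need equally sized, non-empty time and value lists")

    aligned = []
    for t in frame_times:
        i = bisect_left(sample_times, t)
        if i == 0:
            # Frame precedes the first CAN sample: clamp to the first value.
            aligned.append(sample_values[0])
        elif i == len(sample_times):
            # Frame follows the last CAN sample: clamp to the last value.
            aligned.append(sample_values[-1])
        else:
            t0, t1 = sample_times[i - 1], sample_times[i]
            v0, v1 = sample_values[i - 1], sample_values[i]
            weight = (t - t0) / (t1 - t0)
            aligned.append(v0 + weight * (v1 - v0))
    return aligned


# Toy example: three CAN speed samples, four frame timestamps.
can_times = [0.0, 1.0, 2.0]
can_speeds = [10.0, 20.0, 40.0]
frame_times = [0.5, 1.0, 1.5, 3.0]
frame_speeds = interpolate_signal(can_times, can_speeds, frame_times)
```

Linear interpolation suits continuous signals such as speed and yaw. A binary attribute such as lead-vehicle presence would more naturally take the nearest sample instead.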
