OpenLKA: an open dataset of lane keeping assist from market autonomous vehicles

Yuhang Wang, Abdulaziz Alhuraish, Shengming Yuan, Shuyi Wang, Hao Zhou

TL;DR

OpenLKA addresses a critical data gap in Lane Keeping Assist (LKA) by releasing a large-scale, multimodal open dataset collected from market-available LKA-equipped vehicles in Tampa. The dataset integrates CAN bus data, high-fidelity dash-cam video, perception outputs, and environmental annotations generated by a large vision-language model (LVLM), augmented with human-driver data to support cross-domain analyses. Key findings show that LKA is robust under normal conditions but exhibits pronounced weaknesses in perception, planning, and control on curved segments and where lane markings are degraded, with performance depending strongly on road geometry and speed. The work also introduces iLKA, a human-like planning framework that leverages vision-language models and chain-of-thought prompts, and highlights practical implications for road design, rural safety, and AI-assisted LKA development, including cooperative perception architectures.

Abstract

The Lane Keeping Assist (LKA) system has become a standard feature in recent car models. While marketed as providing auto-steering capabilities, the system's operational characteristics and safety performance remain underexplored, primarily due to a lack of real-world testing and comprehensive data. To fill this gap, we extensively tested mainstream LKA systems from leading U.S. automakers in Tampa, Florida. Using an innovative method, we collected a comprehensive dataset that includes full Controller Area Network (CAN) messages with LKA attributes, as well as video, perception, and lateral trajectory data from a high-quality front-facing camera equipped with advanced vision detection and trajectory planning algorithms. Our tests spanned diverse, challenging conditions, including complex road geometry, adverse weather, degraded lane markings, and their combinations. A vision language model (VLM) further annotated the videos to capture weather, lighting, and traffic features. Based on this dataset, we present an empirical overview of LKA's operational features and safety performance. Key findings indicate: (i) LKA is vulnerable to faint markings and low pavement contrast; (ii) it struggles in lane transitions (merges, diverges, intersections), often causing unintended departures or disengagements; (iii) steering torque limitations lead to frequent deviations on sharp turns, posing safety risks; and (iv) LKA systems consistently maintain rigid lane-centering, lacking adaptability on tight curves or near large vehicles such as trucks. We conclude by demonstrating how this dataset can guide both infrastructure planning and self-driving technology. In view of LKA's limitations, we recommend improvements in road geometry and pavement maintenance. Additionally, we illustrate how the dataset supports the development of human-like LKA systems via VLM fine-tuning and Chain of Thought reasoning.

Paper Structure

This paper contains 25 sections, 8 equations, 31 figures, and 3 tables.

Figures (31)

  • Figure 1: OpenLKA: A comprehensive framework for LKA analysis and enhancement.
  • Figure 2: LKA Dataset Overview
  • Figure 3: Human Dataset from Drivers Across the World
  • Figure 4: Comma 3X dash cam used for LKA data collection, mounted on the vehicle's front windshield.
  • Figure 5: LKA data collection sites
  • ...and 26 more figures