Towards Scenario- and Capability-Driven Dataset Development and Evaluation: An Approach in the Context of Mapless Automated Driving

Felix Grün, Marcus Nolte, Markus Maurer

TL;DR

This paper tackles the data bottleneck in mapless automated driving by introducing a scenario- and capability-driven dataset development framework grounded in ISO 21448 (SOTIF) and ISO/TR 4804 Annex B. It uses capability graphs to translate operational driving scenarios into concrete dataset requirements and demonstrates the approach with two driving scenarios, followed by a comprehensive review of existing lane-detection datasets. The analysis highlights gaps in labeling granularity, driving-direction information, and occlusion handling, demonstrating that no single dataset fully supports mapless driving tasks such as complex lane-changing. The proposed framework provides a structured method for deriving dataset requirements and enables meaningful cross-dataset comparisons to guide future data collection and labeling efforts, ultimately supporting safer and more capable perception systems for mapless driving.

Abstract

The foundational role of datasets in defining the capabilities of deep learning models has led to their rapid proliferation. At the same time, published research focusing on the process of dataset development for environment perception in automated driving has been scarce, which reduces the applicability of openly available datasets and impedes the development of effective environment perception systems. Sensor-based, mapless automated driving is one of the contexts where this limitation is evident. While leveraging real-time sensor data instead of pre-defined HD maps promises enhanced adaptability and safety in the face of unexpected environmental changes, it also increases the demands on the scope and complexity of the information provided by the perception system. To address these challenges, we propose a scenario- and capability-based approach to dataset development. Grounded in the principles of ISO 21448 (safety of the intended functionality, SOTIF), extended by ISO/TR 4804, our approach facilitates the structured derivation of dataset requirements. This not only aids in the development of meaningful new datasets but also enables the effective comparison of existing ones. Applying this methodology to a broad range of existing lane detection datasets, we identify significant limitations in current datasets, particularly in terms of real-world applicability, a lack of labeling of critical features, and an absence of comprehensive information for complex driving maneuvers.

Paper Structure (21 sections, 2 figures, 1 table)


Figures (2)

  • Figure 1: Visualization of the example scenarios. The red automated vehicle is approaching a yellow mail van blocking its lane. To continue its mission, it must perform a lane-change maneuver as visualized by the black example trajectory. Different infrastructure elements are highlighted as follows: The ego-lane of the automated vehicle is shown in blue, the adjacent lane with unknown driving direction is shown in red, broken white lane boundaries are marked in green, solid white lane boundaries are marked in yellow, and unmarked lane boundaries are marked in orange.
  • Figure 2: Visualization of a relevant subsection of the capability graph for the example scenarios with the overall mission goal in gray, visible external behavior in orange, high-level capabilities in blue, low-level capabilities in green, and relevant infrastructure elements in red.
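
To make the idea of deriving dataset requirements from a capability graph more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the graph can be stored as a plain adjacency mapping, that the infrastructure elements are its leaf nodes, and that a dataset can be summarized by the set of elements it labels. All node names, the dataset names `dataset_a` and `dataset_b`, and the functions `required_elements` and `coverage_gaps` are hypothetical; the node names only loosely follow the layers and infrastructure elements named in the two figure captions.

```python
from collections import deque

# Hypothetical capability graph: each node maps to the nodes it depends on.
# The layer comments mirror the layers named in the Figure 2 caption.
CAPABILITY_GRAPH = {
    "continue mission": ["perform lane change"],              # mission goal
    "perform lane change": ["plan lane-change trajectory"],   # external behavior
    "plan lane-change trajectory": [                          # high-level capability
        "perceive lane boundaries",
        "perceive lane driving direction",
    ],
    "perceive lane boundaries": [                             # low-level capability
        "broken white lane boundary",
        "solid white lane boundary",
        "unmarked lane boundary",
    ],
    "perceive lane driving direction": [                      # low-level capability
        "adjacent-lane driving direction",
    ],
}


def required_elements(graph, root):
    """Return all leaf nodes reachable from root via breadth-first search.

    Leaves are treated as the infrastructure elements a dataset must label.
    """
    seen = {root}
    leaves = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        children = graph.get(node, [])
        if not children:
            leaves.add(node)
        for child in children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return leaves


def coverage_gaps(requirements, dataset_labels):
    """Map each dataset name to the sorted list of required elements it lacks."""
    return {
        name: sorted(requirements - labels)
        for name, labels in dataset_labels.items()
    }


# Hypothetical label inventories, invented for illustration only.
DATASETS = {
    "dataset_a": {"broken white lane boundary", "solid white lane boundary"},
    "dataset_b": {
        "broken white lane boundary",
        "solid white lane boundary",
        "unmarked lane boundary",
    },
}

requirements = required_elements(CAPABILITY_GRAPH, "continue mission")
gaps = coverage_gaps(requirements, DATASETS)
for name, missing in gaps.items():
    print(name, "is missing:", missing)
```

In this toy setup both hypothetical datasets lack driving-direction labels for the adjacent lane, which is the kind of gap the scenario-driven comparison is meant to surface. A real application would need richer requirements than a set of label names, for example labeling granularity or occlusion handling.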