Structuring Automotive Data for Systems Engineering: A Taxonomy-Based Approach

Carl Philipp Hohl, Philipp Reis, Tobias Schürmann, Stefan Otten, Eric Sax

TL;DR

This work addresses the fragmentation of automotive data by proposing an extensible taxonomy that classifies data by origin (sources) and use (applications). It synthesizes multiple perspectives (ontology, signal descriptions, process-oriented views, scenario focus, and data-centric approaches) to justify a two-axis framework that supports data governance and reuse. Through a systematic literature review of 21,467 articles from 2015–2024, refined to 87 relevant papers, the authors map data to Source and Application facets, revealing the dominance of real-world data and a gap in data for requirements engineering. The taxonomy offers a practical foundation for labeling datasets, guiding data collection, and informing data-driven decision-making in automotive systems engineering, with the potential to reduce silos and improve data utilization across the lifecycle.

Abstract

Vehicle data is essential for advancing data-driven development throughout the automotive lifecycle, including requirements engineering, design, verification and validation, and post-deployment optimization. Developers currently collect data in a decentralized and fragmented manner across simulations, test benches, and real-world driving. This fragmentation results in data silos, inconsistent formats and storage structures, and limited interoperability, which in turn lead to redundant data collection, inefficient integration, and suboptimal use of data. To address these challenges, this article presents a structured literature review and develops an inductive taxonomy for automotive data. This taxonomy categorizes data according to its sources and applications, improving data accessibility and utilization. The analysis reveals a growing emphasis on real-world driving and machine learning applications while highlighting a critical gap in data availability for requirements engineering. By providing a systematic framework for structuring automotive data, this research contributes to more efficient data management and improved decision-making in the automotive industry.
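
As a rough illustration of how a two-facet label (source and application) might be attached to datasets, the following minimal Python sketch may help. It is not the authors' implementation: the category names are taken only from the sources and uses mentioned in the abstract, the full taxonomy in the paper may contain different or additional classes, and the names `DatasetLabel`, `coverage`, `gaps`, and the example dataset identifiers are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    # Origins named in the abstract; the paper's full taxonomy may differ.
    SIMULATION = "simulation"
    TEST_BENCH = "test bench"
    REAL_WORLD = "real-world driving"


class Application(Enum):
    # Uses named in the abstract; an illustrative list, not the paper's facets.
    REQUIREMENTS_ENGINEERING = "requirements engineering"
    DESIGN = "design"
    VERIFICATION_VALIDATION = "verification and validation"
    POST_DEPLOYMENT_OPTIMIZATION = "post-deployment optimization"
    MACHINE_LEARNING = "machine learning"


@dataclass(frozen=True)
class DatasetLabel:
    name: str                 # hypothetical dataset identifier
    source: Source            # where the data originates
    applications: frozenset   # set of Application values the data serves


def coverage(labels):
    """Count datasets per application, keeping applications with zero datasets."""
    counts = Counter({app: 0 for app in Application})
    for label in labels:
        counts.update(label.applications)
    return counts


def gaps(labels):
    """Return the applications that no labeled dataset currently serves."""
    return [app for app, n in coverage(labels).items() if n == 0]


# Hypothetical example datasets, invented for illustration only.
labels = [
    DatasetLabel(
        "fleet_log_a",
        Source.REAL_WORLD,
        frozenset({Application.MACHINE_LEARNING, Application.VERIFICATION_VALIDATION}),
    ),
    DatasetLabel(
        "bench_run_b",
        Source.TEST_BENCH,
        frozenset({Application.VERIFICATION_VALIDATION}),
    ),
]
uncovered = gaps(labels)
```

With labels of this kind, a query such as `gaps(labels)` would surface applications that no dataset serves, which is the sort of gap the abstract reports for requirements engineering.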

Paper Structure

This paper contains 18 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Illustration of our research methodology for reviewing literature, leading from 21,467 articles to a subset of 87 publications relevant to our research.
  • Figure 2: Proposed taxonomy for vehicle real-world data collection derived from the comprehensive literature review illustrated in Fig. 1.