Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering

Chih-Hong Cheng, Brian Hsuan-Cheng Liao, Adam Molin, Hasan Esen

TL;DR

This work proposes workflow-level design principles for trustworthy GenAI integration and demonstrates them in an end-to-end automotive pipeline, from requirement delta identification to SysML v2 architecture update and re-testing.

Abstract

The adoption of large language models in safety-critical system engineering is constrained by trustworthiness, traceability, and alignment with established verification practices. We propose workflow-level design principles for trustworthy GenAI integration and demonstrate them in an end-to-end automotive pipeline, from requirement delta identification to SysML v2 architecture update and re-testing. First, we show that monolithic ("big-bang") prompting misses critical changes in large specifications, while section-wise decomposition with diversity sampling and lightweight NLP sanity checks improves completeness and correctness. Then, we propagate requirement deltas into SysML v2 models and validate updates via compilation and static analysis. Additionally, we ensure traceable regression testing by generating test cases through explicit mappings from specification variables to architectural ports and states, providing practical safeguards for GenAI used in safety-critical automotive engineering.

Paper Structure

This paper contains 15 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: GenAI-assisted workflow from requirement updates to architecture changes and verification.
  • Figure 2: A revised workflow for comparing complicated documents while increasing trustworthiness.
  • Figure 3: Precision and recall per model setting in retrieving relevant sections across ASPICE v3.1 and v4. Union refers to combining the predictions of Qwen3:32B, Nemotron3:30B, and GPT-OSS:20B.
  • Figure 4: Last-mile syntax drift in requirement allocation: "::" replaced by ".", breaking compilation.
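
The "Union" setting in Figure 3 combines the section predictions of several models before scoring. The following is a minimal sketch of how such a union could be scored with precision and recall, not the authors' evaluation code. The function names, model keys, and section identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) of predicted section IDs against the relevant set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def union_predictions(per_model):
    """Merge the predicted section IDs of all models into a single set."""
    merged = set()
    for sections in per_model.values():
        merged |= set(sections)
    return merged


# Hypothetical data: section IDs that truly changed, and each model's guesses.
relevant = {"S1", "S2", "S3", "S4"}
per_model = {
    "model_a": {"S1", "S2", "S9"},
    "model_b": {"S2", "S3"},
    "model_c": {"S4", "S8"},
}

union = union_predictions(per_model)
union_precision, union_recall = precision_recall(union, relevant)
model_a_precision, model_a_recall = precision_recall(per_model["model_a"], relevant)
```

In this toy data the union recovers every relevant section that at least one model found, so its recall is at least that of any single model, while its precision can drop because false positives also accumulate.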