Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test
Xuan Cai, Xuesong Bai, Zhiyong Cui, Danmu Xie, Daocheng Fu, Haiyang Yu, Yilong Ren
TL;DR
Text2Scenario is an automated framework that converts natural-language autonomous driving (AD) test descriptions into executable OpenScenario DSL using a hierarchical scenario repository and an LLM-driven parsing pipeline. A five-stage prompt-engineering process extracts and structures the scenario components, and a priority-based DSL assembly step then generates CARLA-compatible tests. From 92 texts, the framework produced 368 scenarios that revealed 533 safety violations; GPT-4 achieved the best parsing accuracy (≈92%), with high semantic fidelity (ICC ≈ 0.92) and solid driving rationality (ICC ≈ 0.76). The work demonstrates substantial reductions in manual scenario scripting time and lays the groundwork for end-to-end DSL generation and richer, high-risk testing in autonomous driving system (ADS) evaluation.
Abstract
Autonomous driving (AD) testing constitutes a critical methodology for assessing performance benchmarks prior to product deployment. The creation of segmented scenarios within a simulated environment is acknowledged as a robust and effective strategy; however, tailoring these scenarios often requires laborious and time-consuming manual effort, which hinders the development and deployment of AD technologies. In response to this challenge, we introduce Text2Scenario, a framework that leverages a Large Language Model (LLM) to autonomously generate simulation test scenarios that closely align with user specifications expressed in natural language. Specifically, an LLM equipped with a carefully engineered input prompt scheme functions as a text parser for test scenario descriptions, extracting from a hierarchically organized scenario repository the components that most accurately reflect the user's preferences. Subsequently, by exploiting the precedence of scenario components, the framework sequentially matches and links scenario representations within a Domain-Specific Language (DSL) corpus, ultimately producing executable test scenarios. The experimental results demonstrate that such prompt engineering can accurately extract the nuanced details of scenario elements embedded within various descriptive formats, with the majority of generated scenarios aligning closely with the user's initial expectations. This allows for the efficient and precise evaluation of diverse AD stacks without the labor-intensive need for manual scenario configuration. Project page: https://caixxuan.github.io/Text2Scenario.GitHub.io.
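The two steps described above (selecting components from a hierarchical repository, then linking their DSL fragments in precedence order) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the repository layers, the component names, the precedence values, and the DSL fragments are hypothetical placeholders rather than the framework's actual repository or valid OpenScenario syntax, and a simple keyword matcher stands in for the LLM parser.

```python
from dataclasses import dataclass

# Hypothetical hierarchical repository: layer -> {component: (keywords, DSL fragment)}.
# All names and fragments are illustrative placeholders, not valid OpenScenario.
REPOSITORY = {
    "road": {
        "intersection": (("intersection", "junction"), "map: intersection"),
        "highway": (("highway", "freeway"), "map: highway"),
    },
    "weather": {
        "rain": (("rain",), "weather: rain"),
        "clear": (("clear", "sunny"), "weather: clear"),
    },
    "actor_behavior": {
        "cut_in": (("cut in", "cuts in"), "npc: cut_in(ego)"),
        "pedestrian_crossing": (("pedestrian",), "pedestrian: cross(road)"),
    },
}

# Assumed precedence: layers with a lower value are emitted first.
PRECEDENCE = {"road": 0, "weather": 1, "actor_behavior": 2}


@dataclass(frozen=True)
class Component:
    layer: str
    name: str
    dsl: str


def parse_description(text):
    """Stand-in for the LLM parser: pick at most one component per layer
    by case-insensitive keyword matching."""
    lowered = text.lower()
    selected = []
    for layer, options in REPOSITORY.items():
        for name, (keywords, dsl) in options.items():
            if any(keyword in lowered for keyword in keywords):
                selected.append(Component(layer, name, dsl))
                break  # first match wins within a layer
    return selected


def assemble(components):
    """Order the selected components by layer precedence and join their fragments."""
    ordered = sorted(components, key=lambda c: PRECEDENCE[c.layer])
    return "\n".join(c.dsl for c in ordered)


description = "On a rainy highway, a vehicle cuts in ahead of the ego car."
components = parse_description(description)
scenario = assemble(components)
print(scenario)
```

Because assembly sorts by layer precedence, the result does not depend on the order in which the parser returns components. In a real pipeline the keyword matcher would be replaced by prompted LLM calls, and the fragments by entries from an actual DSL corpus.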
