Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model

Yongqi Zhao; Wenbo Xiao; Tomislav Mihalj; Jia Hu; Arno Eichberger

TL;DR

Chat2Scenario presents a GPT-4–driven framework to extract concrete driving scenarios from naturalistic data for ADS validation, addressing data accessibility and preprocessing bottlenecks. The methodology couples a Streamlit web app with an LLM-based scenario understanding module, a rule-based activity/position classifier, and a criticality analysis to filter and rank scenarios, exporting them to ASAM OpenSCENARIO and IPG CarMaker formats. It leverages the highD dataset and demonstrates qualitative extraction of typical scenarios (following, cut-in, cut-out) with reconstruction in Esmini and CarMaker, and provides quantitative metrics on track #36 showing robust performance, especially in cut-in/cut-out cases. The approach enables efficient, scalable scenario search and open-source tooling for ADS virtual testing and validation, with future work aimed at diversifying datasets and refining criticality measures.

Abstract

The advent of Large Language Models (LLMs) offers new ways to validate Automated Driving Systems (ADS). This work presents a novel approach to extracting scenarios from naturalistic driving datasets. The proposed framework, Chat2Scenario, leverages the Natural Language Processing (NLP) capabilities of LLMs to understand and identify different driving scenarios. Given a textual description of the driving conditions and thresholds for the criticality metrics, the framework searches the dataset for the desired scenarios and converts them into ASAM OpenSCENARIO and IPG CarMaker text files. This streamlines the scenario extraction process and improves its efficiency. Simulations are run to validate the efficiency of the approach. The framework is provided as a user-friendly web app and is available at https://github.com/ftgTUGraz/Chat2Scenario.
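The threshold-based search mentioned in the abstract can be illustrated with a minimal sketch. It assumes time-to-collision (TTC) as the criticality metric and a simple per-frame record of gap and speeds. The names `Frame`, `time_to_collision`, `min_ttc`, and `is_critical` are hypothetical and are not taken from the Chat2Scenario code base, which may define and combine its metrics differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One time step of an ego/lead vehicle pair (hypothetical record layout)."""
    gap_m: float           # longitudinal distance from ego to lead vehicle, in metres
    ego_speed_mps: float   # ego longitudinal speed, in m/s
    lead_speed_mps: float  # lead vehicle longitudinal speed, in m/s


def time_to_collision(frame: Frame) -> Optional[float]:
    """Return TTC in seconds, or None when the ego is not closing in on the lead."""
    closing_speed = frame.ego_speed_mps - frame.lead_speed_mps
    if closing_speed <= 0:
        return None  # gap is constant or growing, so TTC is undefined
    return frame.gap_m / closing_speed


def min_ttc(frames: List[Frame]) -> Optional[float]:
    """Smallest defined TTC over a scenario, or None if TTC is never defined."""
    values = [t for t in map(time_to_collision, frames) if t is not None]
    return min(values) if values else None


def is_critical(frames: List[Frame], ttc_threshold_s: float) -> bool:
    """Assumed selection rule: keep a scenario if its minimum TTC is at or below the threshold."""
    lowest = min_ttc(frames)
    return lowest is not None and lowest <= ttc_threshold_s
```

With a user-supplied threshold, a candidate scenario is kept only when `is_critical` returns `True`. Other criticality metrics would slot in by replacing `time_to_collision` with a different per-frame measure.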

Paper Structure (20 sections, 4 equations, 9 figures, 4 tables)

Introduction
Terminology and Dataset Format
Definition of Activity and Event
Dataset Format
Methodology
Chat2Scenario Web App
Scenario Understanding
Scenario Classification Model
Prompt Engineering of LLM
Scenario Searching
Activity Identification
Position Identification
Criticality Analysis
Simulatable Format Generation
ASAM OpenSCENARIO
...and 5 more sections

Figures (9)

Figure 1: Overview of Chat2Scenario web app.
Figure 2: Visualization of Event and Activity: blue arrow represents the vehicle trajectory (elrofai2018scenario; krajewski2018highd).
Figure 3: Schematic overview of the Chat2Scenario framework operations.
Figure 4: Scenario classification model for highD traffic.
Figure 5: Visualization of scenarios: dashed and solid lines represent target and ego vehicle trajectories respectively.
...and 4 more figures
