Driving-RAG: Driving Scenarios Embedding, Search, and RAG Applications

Cheng Chang, Jingwei Ge, Jiazhe Guo, Zelin Guo, Binghong Jiang, Li Li

TL;DR

This work tackles scalable, high-fidelity embedding and retrieval of driving scenarios to empower Retrieval-Augmented Generation (RAG) for autonomous driving. It introduces Driving-RAG, a three-part framework consisting of an aligned scenario embedding model (RGCN+Transformer) trained to match Graph-DTW distances, a KDE-informed HNSW-TSD search that produces a typical scenario data subset and fast nearest-neighbor retrieval, and a retrieval reorganization module that uses graph relations to produce coherent, task-relevant references for LLM augmentation. Experiments on CitySim and INTERACTION show that embeddings align with the driving-distance metric, HNSW-TSD provides substantial speedups over traditional vector search, and LLM-based trajectory planning benefits from both retrieval and relation-based reorganization, improving metrics such as ADE and goal-confusion. The approach is efficient at scale (approximately 30 ms per query at $10^5$ samples) and generalizable to other RAG-enabled tasks beyond driving, enhancing online decision-making and offline simulation workflows.
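To make the distance target concrete, the following is a minimal sketch of classic dynamic time warping (DTW) between two trajectories, the sequence-alignment idea that the name Graph-DTW refers to. It is not the paper's Graph-DTW, which compares scenario graphs; the function name `dtw_distance`, the 2D point representation, and the Euclidean step cost are assumptions made only for illustration.

```python
import math


def dtw_distance(a, b, dist=math.dist):
    """Classic DTW cost between two point sequences (illustrative only).

    a, b : non-empty sequences of equal-dimension points, e.g. (x, y) tuples.
    dist : per-step cost; Euclidean distance is an assumption here.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


if __name__ == "__main__":
    slow = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0)]  # same path, one repeated sample
    fast = [(0.0, 0.0), (1.0, 0.0)]
    print(dtw_distance(slow, fast))
```

An embedding model aligned to such a metric would be trained so that distances between scenario vectors approximate the (much more expensive) pairwise alignment cost, which is what makes approximate nearest-neighbor search over the vectors a useful stand-in for the metric itself.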

Abstract

Driving scenario data play an increasingly vital role in the development of intelligent vehicles and autonomous driving. Accurate and efficient scenario data search is critical both for online vehicle decision-making and planning and for offline scenario generation and simulation, because it allows past scenario experience to be leveraged to improve overall performance. The application of large language models (LLMs) and Retrieval-Augmented Generation (RAG) systems in autonomous driving makes this need especially urgent. In this paper, we introduce the Driving-RAG framework to address the challenges of efficient scenario data embedding, search, and application for RAG systems. Our embedding model aligns fundamental scenario information and scenario distance metrics in the vector space. The typical scenario sampling method, combined with hierarchical navigable small world (HNSW) search, performs scenario vector search efficiently without sacrificing accuracy. In addition, a reorganization mechanism based on graph knowledge enhances the relevance of the retrieved scenarios to the prompt scenarios and augments LLM generation. We demonstrate the effectiveness of the proposed framework on a typical trajectory planning task for complex interactive scenarios such as ramps and intersections, showcasing its advantages for RAG applications.

Paper Structure

This paper contains 11 sections, 7 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: The challenges and corresponding solutions in Driving-RAG framework.
  • Figure 2: (a) The proposed scenario embedding model and training process. (b) The backbone of the scenario graphs encoder.
  • Figure 3: The driving scenario embeddings similarity search process with multi-interactions and HNSW-TSD algorithm.
  • Figure 4: The prompt texts for LLM for typical planning task.
  • Figure 5: The response texts from LLM for typical planning task.
  • ...and 6 more figures