Generative AI for Autonomous Driving: A Review

Katharina Winter, Abhishek Vivekanandan, Rupert Polley, Yinzhe Shen, Christian Schlauch, Mohamed-Khalil Bouzidi, Bojan Derajic, Natalie Grabowsky, Annajoyce Mariani, Dennis Rochau, Giovanni Lucente, Harsh Yadav, Firas Mualla, Adam Molin, Sebastian Bernhard, Christian Wirth, Ömer Şahin Taş, Nadja Klein, Fabian B. Flohr, Hanno Gottschalk

TL;DR

Generative AI for Autonomous Driving surveys a broad set of GenAI models and hybrid methods applied to driving tasks, including static map and dynamic scenario generation, trajectory forecasting, and motion planning. It emphasizes the complementary roles of diffusion models, VAEs, GANs, normalizing flows, energy-based models, and transformers, and discusses conditioning, uncertainty quantification, and classical learning strategies to stabilize training. The review also covers the autonomous driving stack, data modalities, world models, and end-to-end driving, highlighting open challenges in safety, interpretability, and real-time feasibility, while offering practical recommendations and directions for future research. The work underscores the potential of hybrid GenAI-plus-traditional approaches and LLM-enabled planning, but calls for robust evaluation, domain-gap assessment, and efficient deployment on edge devices to translate advances into market-ready AD systems.

Abstract

Generative AI (GenAI) is rapidly advancing the field of Autonomous Driving (AD), extending beyond traditional applications in text, image, and video generation. We explore how generative models can enhance automotive tasks, such as static map creation, dynamic scenario generation, trajectory forecasting, and vehicle motion planning. By examining multiple generative approaches, ranging from Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Invertible Neural Networks (INNs) to Generative Transformers (GTs) and Diffusion Models (DMs), we highlight and compare their capabilities and limitations for AD-specific applications. Additionally, we discuss hybrid methods that integrate conventional techniques with generative approaches, and emphasize their improved adaptability and robustness. We also identify relevant datasets and outline open research questions to guide future developments in GenAI. Finally, we discuss three core challenges (safety, interpretability, and real-time capabilities) and present recommendations for image generation, dynamic scenario generation, and planning.

Paper Structure

This paper contains 60 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: The AD stack.
  • Figure 2: Comparison of prediction types in a traffic intersection scenario. Blue and orange represent different prediction modes. Subfigure (a) shows marginal prediction for the red vehicle, (b) shows conditional prediction of the red vehicle given the ego vehicle (shown in blue), and (c) shows joint prediction, where blue and orange each represent a prediction for the entire scene.
  • Figure 3: Demonstration of scenario generation starting from an initial position and a tag. Images were taken from ding_realgen_2024.
  • Figure 4: Demonstration of text-prompt-to-scenario generation. Image was taken from tan_language_2023. Prompt: The scene is very dense. There are only vehicles on the left side of the center car. Most cars are moving in fast speed. The ego-vehicle turns right.