A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research

Seongjin Choi, Zhixiong Jin, Seung Woo Ham, Jiwon Kim, Lijun Sun

TL;DR

This paper surveys the state of the art in Deep Generative Models (DGMs) and their transportation applications, contrasting DGMs with traditional discriminative approaches and detailing how they learn and sample from complex, high-dimensional data distributions. It provides a rigorous foundation for DGMs (VAEs, GANs, Normalizing Flows, Score-based and Diffusion models), a systematic transportation-focused literature review, and two practical tutorials (synthetic household travel data and highway speed contours) with open-source code. The authors discuss core challenges (evaluation, dynamic data shifts, data privacy, bias, trustworthy AI, and causality) and outline opportunities for robust, interpretable, and policy-relevant DGMs in smart mobility. By linking foundational theory, empirical transportation studies, and hands-on tutorials, the work aims to catalyze broader adoption and responsible advancement of generative methods in transportation research.

Abstract

Deep Generative Models (DGMs) have rapidly advanced in recent years, becoming essential tools in various fields due to their ability to learn complex data distributions and generate synthetic data. Their importance in transportation research is increasingly recognized, particularly for applications like traffic data generation, prediction, and feature extraction. This paper offers a comprehensive introduction and tutorial on DGMs, with a focus on their applications in transportation. It begins with an overview of generative models, followed by detailed explanations of fundamental models, a systematic review of the literature, and practical tutorial code to aid implementation. The paper also discusses current challenges and opportunities, highlighting how these models can be effectively utilized and further developed in transportation research. This paper serves as a valuable reference, guiding researchers and practitioners from foundational knowledge to advanced applications of DGMs in transportation research.

Paper Structure

This paper contains 100 sections, 109 equations, 22 figures, 12 tables.

Figures (22)

  • Figure 1: Schematic overview of an Autoencoder (AE) and a Variational Autoencoder (VAE). Here, $\mathbf{x}$ is the real data, $\mathbf{x}'$ denotes the generated data, and $\mathbf{z}$ represents the latent vector.
  • Figure 2: Schematic overview of a Generative Adversarial Network (GAN). Here, $\mathbf{x}$ is the real data, $\mathbf{x}'$ denotes the generated data, and $\mathbf{z}$ represents the random noise.
  • Figure 3: Schematic overview of a normalizing flow. Here, $\mathbf{z}_0$ follows a simple, known distribution (such as a standard Gaussian), $\mathbf{z}_i$ follows an intermediate distribution, and $\mathbf{z}_K$ follows the target distribution.
  • Figure 4: Schematic overview of a score-based generative model. Here, $\mathbf{x}$ denotes a data sample from the underlying distribution, and $\nabla_{\mathbf{x}} \log p(\mathbf{x})$ represents the score function.
  • Figure 5: Schematic overview of a diffusion model. Here, $\mathbf{x}_0$ represents the original data, $\mathbf{x}_t$ represents the intermediate noisy states, and $\mathbf{x}_T$ represents pure noise (e.g., drawn from a Gaussian distribution).
  • ...and 17 more figures
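
To make the score function $\nabla_{\mathbf{x}} \log p(\mathbf{x})$ from the Figure 4 caption concrete, the sketch below evaluates it for a one-dimensional Gaussian, where it has the closed form $-(x - \mu)/\sigma^2$. This is an illustration only and is not taken from the paper's tutorial code. The function names (`gaussian_log_density`, `gaussian_score`, `numerical_score`) and the choice of a Gaussian density are assumptions made for this example.

```python
import math


def gaussian_log_density(x, mu=0.0, sigma=1.0):
    """Log-density of a 1-D Gaussian N(mu, sigma^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def gaussian_score(x, mu=0.0, sigma=1.0):
    """Analytic score of the Gaussian: d/dx log p(x) = -(x - mu) / sigma^2."""
    return -(x - mu) / sigma ** 2


def numerical_score(x, mu=0.0, sigma=1.0, h=1e-5):
    """Central finite-difference approximation of the score, used as a check."""
    upper = gaussian_log_density(x + h, mu, sigma)
    lower = gaussian_log_density(x - h, mu, sigma)
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    # Illustrative values only (assumed for this example).
    x, mu, sigma = 2.0, 1.0, 0.5
    print("analytic score: ", gaussian_score(x, mu, sigma))
    print("numerical score:", numerical_score(x, mu, sigma))
```

The score points toward the mean, that is, toward regions of higher density. In a score-based generative model the density is unknown, so a neural network is typically trained to approximate this quantity instead of computing it in closed form.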