Synthetic Data Generation for Minimum-Exposure Navigation in a Time-Varying Environment using Generative AI Models

Nachiket U. Bapat, Randy C. Paffenroth, Raghvendra V. Cowlagi

TL;DR

The paper tackles synthetic data generation for autonomous navigation in a time-varying threat field when real observations are scarce. It introduces the split variational recurrent neural network (S-VRNN), which combines a small real dataset with a noiseless support dataset derived from the known dynamics. The latent space is split into two subspaces, $\kappa_1$ and $\kappa_2$, to separate data-specific noise from dynamics-driven structure. Empirical results show that the S-VRNN yields synthetic samples whose distribution closely matches that of the real data, outperforming both a purely data-driven VRNN and a split-VAE, particularly in low-data regimes, and that the samples reflect the prescribed dynamics, which are specified via a Hurwitz matrix $A$. This dynamics-aware approach reduces the reality gap for synthetic data, facilitating faster validation, planning, and digital twin development in engineering contexts with limited observations.

Abstract

We study the problem of synthetic generation of samples of environmental features for autonomous vehicle navigation. These features are described by a spatiotemporally varying scalar field that we refer to as a threat field. The threat field is known to have some underlying dynamics subject to process noise. Some "real-world" data of observations of various threat fields are also available. The assumption is that the volume of "real-world" data is relatively small. The objective is to synthesize samples that are statistically similar to the data. The proposed solution is a generative artificial intelligence model that we refer to as a split variational recurrent neural network (S-VRNN). The S-VRNN merges the capabilities of a variational autoencoder, which is a widely used generative model, and a recurrent neural network, which is used to learn temporal dependencies in data. The main innovation in this work is that we split the latent space of the S-VRNN into two subspaces. The latent variables in one subspace are learned using the "real-world" data, whereas those in the other subspace are learned using the data as well as the known underlying system dynamics. Through numerical experiments we demonstrate that the proposed S-VRNN can synthesize data that are statistically similar to the training data even in the case of a very small volume of "real-world" training data.
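To make the distinction between noiseless dynamics-derived data and noisy "real-world" data concrete, the following is a minimal sketch, not the authors' implementation. It assumes the threat field is summarized by a low-dimensional state vector that evolves under linear dynamics with a Hurwitz matrix `A` plus additive process noise. The specific matrix, state dimension, step size, noise level, and the helper names `is_hurwitz` and `simulate` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def is_hurwitz(A):
    """Return True if every eigenvalue of A has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def simulate(A, x0, n_steps, dt, noise_std, rng):
    """Euler-Maruyama rollout of dx = A x dt + noise_std dW (illustrative model)."""
    x = np.empty((n_steps + 1, x0.size))
    x[0] = x0
    for k in range(n_steps):
        w = rng.standard_normal(x0.size)
        x[k + 1] = x[k] + dt * (A @ x[k]) + noise_std * np.sqrt(dt) * w
    return x

rng = np.random.default_rng(0)

# Hypothetical 2-state stand-in for the threat field parameters.
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
x0 = np.array([1.0, -1.0])

# Noiseless rollout: a stand-in for dynamics-derived support data.
support = simulate(A, x0, n_steps=200, dt=0.01, noise_std=0.0, rng=rng)

# A small batch of noisy rollouts: a stand-in for scarce "real-world" data.
real = np.stack([
    simulate(A, x0, n_steps=200, dt=0.01, noise_std=0.2, rng=rng)
    for _ in range(25)
])

print(is_hurwitz(A), support.shape, real.shape)
```

Because `A` is Hurwitz, the noiseless rollout decays toward the origin, while the noisy rollouts scatter around it. A generative model in this setting would be trained on data like `real`, with data like `support` supplying the structure implied by the known dynamics.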

Paper Structure

This paper contains 11 sections, 12 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Sample threat field representing a single data point.
  • Figure 2: Illustration of S-VRNN training data. The S-VRNN architecture exploits the idea that, by definition, the support dataset (red dots) lies in a sub-manifold of the manifold formed by the dataset $\mathcal{X}$ (gray dots).
  • Figure 3: Visualization of the training- and generated data for $N_\mathrm{D} = 50$.
  • Figure 4: Visualization of the training- and generated data for $N_\mathrm{D} = 25$.
  • Figure 5: S-VAE generated samples for $N_\mathrm{D} = 25$.
  • ...and 2 more figures