Synthetic Trajectory Generation Through Convolutional Neural Networks

Jesse Merhi, Erik Buchholz, Salil S. Kanhere

TL;DR

The paper tackles the privacy-sensitive problem of publishing location trajectories by exploring synthetic data generation with a CNN-based approach. It introduces a Reversible Trajectory-to-CNN Transformation (RTCT) to convert trajectories into CNN-friendly inputs and integrates this with a DCGAN in a proof-of-concept, comparing against an RNN-based trajectory GAN under both non-private and DP-SGD regimes. Results show the CNN-based model better captures spatial distributions but lags in sequential and temporal fidelity, with DP training generally weakening performance; nonetheless, the work demonstrates a viable pathway to applying computer-vision generative models to trajectory data and provides open-source code to spur further research. The study highlights the trade-offs between privacy guarantees and utility, suggesting that more domain-specific encodings and tailored losses are needed to reach practical utility while maintaining formal privacy. Overall, the work establishes foundational steps toward privacy-preserving, CNN-based trajectory synthesis and invites targeted future improvements.

Abstract

Location trajectories provide valuable insights for applications from urban planning to pandemic control. However, mobility data can also reveal sensitive information about individuals, such as political opinions, religious beliefs, or sexual orientations. Existing privacy-preserving approaches for publishing this data face a significant utility-privacy trade-off. Releasing synthetic trajectory data generated through deep learning offers a promising solution. Due to the trajectories' sequential nature, most existing models are based on recurrent neural networks (RNNs). However, research in generative adversarial networks (GANs) largely employs convolutional neural networks (CNNs) for image generation. This discrepancy raises the question of whether advances in computer vision can be applied to trajectory generation. In this work, we introduce a Reversible Trajectory-to-CNN Transformation (RTCT) that adapts trajectories into a format suitable for CNN-based models. We integrated this transformation with the well-known DCGAN in a proof-of-concept (PoC) and evaluated its performance against an RNN-based trajectory GAN using four metrics across two datasets. The PoC was superior in capturing spatial distributions compared to the RNN model but had difficulty replicating sequential and temporal properties. Although the PoC's utility is not sufficient for practical applications, the results demonstrate the transformation's potential to facilitate the use of CNNs for trajectory generation, opening up avenues for future research. To support continued research, all source code has been made available under an open-source license.

Paper Structure (36 sections, 7 equations, 6 figures)

Figures (6)

  • Figure 1: A trajectory's latitude, longitude, day, and hour values are normalised into a $12\times12$ matrix and upscaled to $24\times24\times4$ for DCGAN.
  • Figure 2: Reversion: DCGAN generates a $24\times24\times4$ trajectory, which is downsampled to $12\times12\times4$ and denormalised to retrieve the original format.
  • Figure 3: On Geolife, the CNN-based PoC significantly outperforms the RNN-based model on the spatial metrics, i.e., it captures the point distribution better. However, the RNN-based model excels at capturing the distance between consecutive points. The PoC struggles to generate sensible timestamps.
  • Figure 4: On the fs_nyc dataset, the CNN-based PoC captures the spatial distribution better than the RNN-based model, although the difference is less significant than on Geolife. Interestingly, performs the worst in regard to the on . The shows similar results to Geolife.
  • Figure 5: The real trajectory is dense and follows a primary direction. Generated trajectories are initially clustered in the map's centre but become more meaningful after training.
  • ...and 1 more figure
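
The transformation and reversion described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes trajectories of exactly 144 points (so they fill a $12\times12$ grid), per-feature min-max normalisation to $[-1, 1]$, nearest-neighbour upscaling, and $2\times2$ block averaging for downsampling. The function names `to_image` and `from_image` and the bounds `lo` and `hi` are hypothetical.

```python
import numpy as np

GRID = 12   # 12 x 12 = 144 points per trajectory (assumed fixed length)
SCALE = 2   # upscaling factor, 12 -> 24


def to_image(traj, lo, hi):
    """Map a (144, 4) array of [lat, lon, day, hour] to a (24, 24, 4) array.

    lo and hi are assumed per-feature lower and upper bounds of shape (4,).
    """
    norm = 2.0 * (traj - lo) / (hi - lo) - 1.0   # min-max normalise to [-1, 1]
    grid = norm.reshape(GRID, GRID, 4)           # row-major point layout
    # Nearest-neighbour upscaling: every cell becomes a 2 x 2 block.
    return grid.repeat(SCALE, axis=0).repeat(SCALE, axis=1)


def from_image(img, lo, hi):
    """Invert to_image: (24, 24, 4) array back to a (144, 4) trajectory."""
    # Average each 2 x 2 block to get back to 12 x 12 x 4.
    blocks = img.reshape(GRID, SCALE, GRID, SCALE, 4)
    grid = blocks.mean(axis=(1, 3))
    norm = grid.reshape(GRID * GRID, 4)
    return (norm + 1.0) / 2.0 * (hi - lo) + lo   # denormalise
```

Under these assumptions, `from_image(to_image(traj, lo, hi), lo, hi)` recovers `traj` up to floating-point error, which is what makes the mapping reversible. For generated images, the block averaging additionally smooths out disagreement between the four pixels that represent one point.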

Theorems & Definitions (1)

  • Definition 1: Differential Privacy
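
For reference, the standard $(\varepsilon, \delta)$ formulation of differential privacy, which is the guarantee DP-SGD is usually analysed under, can be written as below. The paper's own Definition 1 may use different notation; the symbols $\mathcal{M}$, $D$, $D'$, and $S$ are chosen here for illustration.

```latex
% A randomised mechanism \mathcal{M} satisfies (\varepsilon, \delta)-differential
% privacy if, for all datasets D and D' differing in a single record and for
% every set of outputs S \subseteq \mathrm{Range}(\mathcal{M}):
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
```

A smaller $\varepsilon$ means the output distribution changes less when one record is added or removed, and $\delta$ bounds the probability with which this multiplicative bound may fail.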