ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation

Wei Shao, Rongyi Zhu, Cai Yang, Chandra Thapa, Muhammad Ejaz Ahmed, Seyit Camtepe, Rui Zhang, DuYong Kim, Hamid Menouar, Flora D. Salim

TL;DR

ST-DPGAN is a privacy-preserving framework for spatiotemporal data generation that integrates differential privacy with a Graph-GAN. The generator uses a transConv1d module to map 1-D Gaussian noise to a $T \times N$ spatiotemporal representation, while the discriminator applies spatial and temporal attention over a graph-embedded input, with DP-SGD enforcing the privacy guarantee. Across three real-world datasets, ST-DPGAN and its Attn variant achieve higher data quality (lower MSE/MAE) than baselines such as DPGAN and WGAN under varying privacy budgets, and ablation studies confirm the critical roles of transConv1d and graph embedding. The work demonstrates that privacy-protected synthetic spatiotemporal data can retain substantial utility for downstream predictive tasks, enabling safer data sharing and analysis in sensitive domains.
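
To make the generator idea concrete, here is a minimal NumPy sketch of a transposed 1-D convolution that expands a short Gaussian noise vector into a $T \times N$ array. It is an illustration only, not the paper's transConv1d module: the function name `trans_conv1d`, the sizes `T`, `N`, the noise length, the kernel length, the stride, and the final reshape are all assumptions chosen so the shapes work out.

```python
import numpy as np

def trans_conv1d(x, kernel, stride=1):
    """Transposed 1-D convolution (no padding): every input entry
    scatters a scaled copy of the kernel into the output."""
    n, s = len(x), len(kernel)
    out = np.zeros((n - 1) * stride + s)
    for j, xj in enumerate(x):
        out[j * stride : j * stride + s] += xj * kernel
    return out

rng = np.random.default_rng(0)

T, N = 4, 6                      # hypothetical: 4 time steps, 6 graph nodes
noise = rng.normal(size=11)      # 1-D Gaussian noise (length is an assumption)
kernel = rng.normal(size=4)      # stands in for a learned kernel

# Output length is (11 - 1) * 2 + 4 = 24 = T * N
flat = trans_conv1d(noise, kernel, stride=2)
sample = flat.reshape(T, N)      # assumed layout: rows are time, columns are nodes
```

With no padding, the output length is `(len(x) - 1) * stride + len(kernel)`, which is why the stride and kernel length have to be chosen to hit `T * N` exactly. With `stride=1` the operation reduces to an ordinary full convolution.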

Abstract

Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
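
The phrase "training under Gaussian noise to achieve differential privacy" refers to DP-SGD-style training. The sketch below shows one generic DP-SGD update in NumPy: clip each per-example gradient, sum, add Gaussian noise scaled to the clipping bound, then average. It assumes the standard recipe; the function name `dp_sgd_step` and all hyperparameter values are hypothetical, and the paper's actual training loop may differ.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD update (a sketch, not the paper's implementation).

    per_example_grads has shape (batch_size, dim): one gradient per example.
    """
    # 1. Clip each per-example gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # 2. Sum the clipped gradients and add Gaussian noise whose standard
    #    deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)

    # 3. Ordinary gradient descent step on the noisy average.
    return params - lr * noisy_mean

rng = np.random.default_rng(0)
params = np.zeros(2)
grads = np.array([[3.0, 4.0],    # norm 5.0, clipped down to norm 1.0
                  [0.0, 0.5]])   # norm 0.5, left unchanged

# With noise_multiplier=0 the step is deterministic, which makes the
# clipping behaviour easy to inspect.
no_noise = dp_sgd_step(params, grads, lr=1.0, clip_norm=1.0,
                       noise_multiplier=0.0, rng=rng)

# A private step uses a positive noise multiplier (value is hypothetical).
private = dp_sgd_step(params, grads, lr=1.0, clip_norm=1.0,
                      noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can move the parameters, and the noise scale is tied to that bound. A larger `noise_multiplier` gives a stronger privacy guarantee at the cost of noisier updates, which is the utility trade-off the experiments measure across privacy budgets.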

Paper Structure

This paper contains 15 sections, 2 theorems, 15 equations, 2 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Let the initial random data be $X$ of size $1 \times N$, where each entry is $x_{j}$, and let the kernel be $K$ of size $s\times s$, where each entry is $k_{ij}$. Assume $x_{j}\stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(\mu_1,\sigma_1^2)$, $k_{ij}\stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(\mu_2,\sigma_2^2)$, and $x_{\ldots}$ … where $Q_1,Q_2\sim \chi_m^2$, $G\sim \mathcal{N}(0,\frac{m^2(\sigma_1^2+\sigma_2^2)(\mu_1^2+\mu_2^2\cdots}{\cdots})$ …

Figures (2)

  • Figure 1: A hostile individual can extract sensitive information from original spatiotemporal data. Our proposed method enables the generation of privacy-protected data while maintaining good data quality. Our method serves as a fundamental study for applying spatiotemporal analysis techniques on a large scale.
  • Figure 2: Architecture of ST-DPGAN.

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • Lemma 2