VFLGAN-TS: Vertical Federated Learning-based Generative Adversarial Networks for Publication of Vertically Partitioned Time-Series Data

Xun Yuan; Zilong Zhao; Prosanta Gope; Biplab Sikdar

TL;DR

VFLGAN-TS is proposed, combining an attribute discriminator with vertical federated learning to generate synthetic time-series data in the vertically partitioned scenario. An enhanced privacy auditing scheme is also developed to evaluate potential privacy breaches through the VFLGAN-TS framework and its synthetic datasets.

Abstract

In the current artificial intelligence (AI) era, the scale and quality of the dataset play a crucial role in training a high-quality AI model. However, the original data often cannot be shared due to privacy concerns and regulations. A potential solution is to release a synthetic dataset with a distribution similar to that of the private dataset. Nevertheless, in some scenarios, the attributes required to train an AI model are distributed among different parties, and the parties cannot share their local data for synthetic data construction due to privacy regulations. At PETS 2024, we introduced the first Vertical Federated Learning-based Generative Adversarial Network (VFLGAN) for publishing vertically partitioned static data. However, VFLGAN cannot effectively handle time-series data, which has both temporal and attribute dimensions. In this article, we propose VFLGAN-TS, which combines the ideas of attribute discriminator and vertical federated learning to generate synthetic time-series data in the vertically partitioned scenario. The performance of VFLGAN-TS is close to that of its counterpart, which is trained in a centralized manner and represents the upper limit for VFLGAN-TS. To further protect privacy, we apply a Gaussian mechanism to make VFLGAN-TS satisfy $(\epsilon,\delta)$-differential privacy. In addition, we develop an enhanced privacy auditing scheme to evaluate potential privacy breaches through the VFLGAN-TS framework and its synthetic datasets.

Paper Structure (30 sections, 7 theorems, 31 equations, 7 figures, 6 tables, 3 algorithms)

Introduction
Related Work
Publication of Vertically Partitioned Data
GANs for Time-Series Data
Differentially Private Mechanisms for GANs
Privacy Auditing Methods
Preliminaries
GANs for Time-Series Data
Auditing Scheme for Synthetic Datasets
Differential Privacy
Proposed VFLGAN-TS
Problem Formulation
Framework of VFLGAN-TS
Training Process of VFLGAN-TS
Differentially Private VFLGAN-TS
...and 15 more sections

Key Result

Proposition 1

(Composition of RDP) Let $f : D \rightarrow R_1$ be $(\alpha, \epsilon_1)$-RDP and $g : R_1 \times D \rightarrow R_2$ be $(\alpha,\epsilon_2)$-RDP. Then the mechanism defined as $(X, Y)$, where $X \sim f(D)$ and $Y \sim g(X, D)$, satisfies $(\alpha, \epsilon_1 + \epsilon_2)$-RDP.
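The composition rule above can be illustrated with a small numerical sketch. It assumes two standard facts about Rényi differential privacy that are not restated in this summary: a Gaussian mechanism with L2 sensitivity $\Delta$ and noise scale $\sigma$ satisfies $(\alpha, \alpha\Delta^2/(2\sigma^2))$-RDP, and an $(\alpha, \epsilon)$-RDP mechanism satisfies $(\epsilon + \log(1/\delta)/(\alpha-1), \delta)$-differential privacy. The function names and the values of `alpha`, `sigma`, `steps`, and `delta` are hypothetical and chosen only for illustration; the accounting used in the paper itself may differ (for example, it may account for subsampling or search over several orders $\alpha$).

```python
import math


def gaussian_rdp(alpha: float, sigma: float, sensitivity: float = 1.0) -> float:
    """RDP epsilon at order alpha for a Gaussian mechanism (assumed standard bound)."""
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)


def compose_rdp(eps_list):
    """Proposition 1: at a fixed order alpha, RDP epsilons add under composition."""
    return sum(eps_list)


def rdp_to_dp(alpha: float, rdp_eps: float, delta: float) -> float:
    """Convert (alpha, rdp_eps)-RDP to an (epsilon, delta)-DP guarantee."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)


# Hypothetical settings, for illustration only.
alpha = 10.0   # Renyi order (fixed for every step)
sigma = 2.0    # noise scale of each Gaussian mechanism
steps = 5      # number of composed noisy steps
delta = 1e-5   # target delta for the final (epsilon, delta)-DP statement

per_step = gaussian_rdp(alpha, sigma)        # RDP cost of one step
total = compose_rdp([per_step] * steps)      # additive composition at order alpha
eps = rdp_to_dp(alpha, total, delta)         # final epsilon at the chosen delta

print(f"per-step RDP epsilon: {per_step:.4f}")
print(f"composed RDP epsilon: {total:.4f}")
print(f"(epsilon, delta)-DP epsilon: {eps:.4f}")
```

Because the composed cost grows linearly in the number of steps at a fixed order, a practical accountant would typically repeat this calculation over a grid of orders $\alpha$ and report the smallest resulting $\epsilon$.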

Figures (7)

Figure 1: Framework of the Proposed VFLGAN-TS.
Figure 2: Wasserstein distance curves during training of different methods on the Sine datasets.
Figure 3: The histograms in each sub-figure show the amplitude distribution of each attribute. Each point in a sub-figure represents one sample's amplitudes for both attributes.
Figure 4: Wasserstein distance curves during training of different methods on the EEG dataset.
Figure 5: Visualization of similarity between real and synthetic datasets using PCA and t-SNE. The left two columns are results for EEG 0 and the right two columns are results for EEG 1.
...and 2 more figures

Theorems & Definitions (14)

Definition 1
Definition 2
Definition 3
Proposition 1
Proposition 2
Proposition 3
Theorem 1
proof
Proposition 4
proof
...and 4 more
