Generating Synthetic Time Series Data for Cyber-Physical Systems
Alexander Sommers, Somayeh Bakhtiari Ramezani, Logan Cummins, Sudip Mittal, Shahram Rahimi, Maria Seale, Joseph Jaboure
TL;DR
The paper addresses data augmentation for time series in cyber-physical systems (CPS) by proposing a pure transformer-based synthesizer trained as a conditional GAN and evaluated with the Wasserstein-Fourier Distance on normalized power spectral densities. It introduces a hierarchical transformer architecture with frequency-aware conditioning and a dual-head critic, and tests it on the FEMTO/PRONOSTIA bearing dataset as well as a synthetic dataset. Results show that the approach underperforms on real bearing data, where generated samples look plausible but are spectrally misaligned, and that divergence is similarly high on synthetic data, highlighting the challenge of preserving long-range dependencies and spectral structure. The work motivates exploring diffusion-based methods and frequency-domain conditioning to improve synthetic time series quality for practical CPS data augmentation.
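To make the evaluation metric concrete, the following is a minimal sketch of a Wasserstein-Fourier style distance between two signals. It is not the authors' implementation. It assumes a plain periodogram as the power spectral density estimate, the 2-Wasserstein distance computed from quantile functions, and hypothetical helper names (`normalized_psd`, `wasserstein_fourier`) and parameters (`fs`, `n_quantiles`) chosen only for illustration.

```python
import numpy as np


def normalized_psd(x, fs=1.0):
    """Periodogram of a real 1-D signal, scaled so its values sum to 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # remove the DC offset so it does not dominate the spectrum
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power.sum()
    if total == 0:
        raise ValueError("signal has no spectral power after mean removal")
    return freqs, power / total


def quantile_function(freqs, weights, u):
    """Generalized inverse CDF of a discrete distribution over freqs."""
    cdf = np.cumsum(weights)
    idx = np.searchsorted(cdf, u, side="left")
    idx = np.clip(idx, 0, freqs.size - 1)  # guard against rounding in the CDF
    return freqs[idx]


def wasserstein_fourier(x, y, fs=1.0, n_quantiles=2048):
    """Approximate 2-Wasserstein distance between the normalized PSDs of x and y.

    Assumption: in one dimension, W2 equals the L2 distance between quantile
    functions, approximated here on a midpoint grid of n_quantiles levels.
    """
    fx, px = normalized_psd(x, fs)
    fy, py = normalized_psd(y, fs)
    u = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qx = quantile_function(fx, px, u)
    qy = quantile_function(fy, py, u)
    return float(np.sqrt(np.mean((qx - qy) ** 2)))


if __name__ == "__main__":
    fs = 100.0
    t = np.arange(1000) / fs
    real = np.sin(2 * np.pi * 5.0 * t)    # stand-in for a real signal
    fake = np.sin(2 * np.pi * 20.0 * t)   # stand-in for a generated signal
    print(wasserstein_fourier(real, fake, fs=fs))
```

Under these assumptions, two pure tones yield a distance close to the gap between their frequencies, which illustrates how a generated signal can look plausible in the time domain while still scoring poorly when its spectral mass sits in the wrong place.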
Abstract
Data augmentation is an important facilitator of deep learning applications in the time series domain. A gap is identified in the literature: the transformer, the dominant sequence model, has been only sparsely explored for data augmentation in time series. An architecture hybridizing several successful priors is put forth and tested using a powerful time domain similarity metric. Results indicate the difficulty of this domain and point to several valuable directions for future work.
