AVATAR: Adversarial Autoencoders with Autoregressive Refinement for Time Series Generation
MohammadReza EskandariNasab, Shah Muhammad Hamdi, Soukaina Filali Boubrahimi
TL;DR
AVATAR generates realistic time series by combining Adversarial Autoencoders with autoregressive refinement, using a supervisor network and a joint training regime. It introduces a distribution loss that aligns the aggregated latent posterior with a Gaussian prior, and uses a regularized GRU architecture to improve generalization. Across diverse real-world and synthetic datasets, AVATAR consistently surpasses TimeGAN and other baselines in resemblance to real data and in predictive fidelity, while maintaining stability. The framework enables more reliable data augmentation for time series tasks and suggests potential applications in missing-value imputation and privacy-preserving data synthesis.
Abstract
Data augmentation can significantly enhance the performance of machine learning tasks by addressing data scarcity and improving generalization. However, generating time series data presents unique challenges. A model must learn a probability distribution that reflects the real data distribution and must also capture the conditional distribution at each time step, so that the inherent temporal dependencies are preserved. To address these challenges, we introduce AVATAR, a framework that combines Adversarial Autoencoders (AAE) with Autoregressive Learning to achieve both objectives. Specifically, our technique integrates the autoencoder with a supervisor and introduces a novel supervised loss that helps the decoder learn the temporal dynamics of time series data. We also propose a second loss function, termed distribution loss, which guides the encoder to align the aggregated posterior of the autoencoder's latent representation with a Gaussian prior more efficiently. Finally, our framework employs a joint training mechanism that trains all networks simultaneously using a combined loss, thereby fulfilling the dual objectives of time series generation. We evaluate our technique across a variety of time series datasets with diverse characteristics. Our experiments demonstrate significant improvements in both the quality and the practical utility of the generated data, as assessed by various qualitative and quantitative metrics.
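To make the loss terms named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that the distribution loss can be approximated by matching the per-dimension mean and standard deviation of a batch of latent codes to a standard Gaussian, that the supervised loss is a one-step-ahead squared error on a scalar sequence, and that the combined loss is a weighted sum. The function names, the `weights` parameter, and the scalar simplification are all hypothetical; the abstract does not specify these details.

```python
from statistics import fmean, pstdev


def distribution_loss(latents):
    """Moment-matching penalty between a batch of latent codes and N(0, 1).

    latents: list of equal-length lists (batch x latent_dim).
    Assumption: only the first two moments are matched; the paper's
    exact formulation may differ.
    """
    columns = list(zip(*latents))  # one tuple of values per latent dimension
    mean_gap = fmean([abs(fmean(col)) for col in columns])
    std_gap = fmean([abs(pstdev(col) - 1.0) for col in columns])
    return mean_gap + std_gap


def supervised_loss(sequence, predict_next):
    """Mean squared error of one-step-ahead predictions along a sequence.

    predict_next is a hypothetical stand-in for a supervisor network:
    it maps the value at step t to a guess for the value at step t + 1.
    """
    errors = [
        (predict_next(sequence[t]) - sequence[t + 1]) ** 2
        for t in range(len(sequence) - 1)
    ]
    return fmean(errors)


def combined_loss(recon, adv, sup, dist, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; the weights are hypothetical."""
    return sum(w * term for w, term in zip(weights, (recon, adv, sup, dist)))


if __name__ == "__main__":
    # Latent codes far from the prior are penalized more than codes near it.
    print(distribution_loss([[-1.0, 1.0], [1.0, -1.0]]))
    print(distribution_loss([[4.0, 4.0], [6.0, 6.0]]))
    # A predictor that matches the sequence dynamics has zero supervised loss.
    print(supervised_loss([1.0, 2.0, 3.0], lambda x: x + 1.0))
```

In a joint training setup of the kind the abstract describes, a single scalar such as the output of `combined_loss` would be minimized with respect to all networks at once, rather than optimizing each term in a separate phase.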
