Table of Contents
Fetching ...

Financial time series augmentation using transformer based GAN architecture

Andrzej Podobiński, Jarosław A. Chudziak

TL;DR

This work shows that training a Long Short-Term Memory forecasting model on a dataset augmented with synthetic data generated by a transformer-based GAN (TTS-GAN) significantly improves the forecasting accuracy compared to using real data alone, and proposes a novel, time series specific quality metric.

Abstract

Time-series forecasting is a critical task across many domains, from engineering to economics, where accurate predictions drive strategic decisions. However, applying advanced deep learning models in challenging, volatile domains like finance is difficult due to the inherent limitation and dynamic nature of financial time series data. This scarcity often results in sub-optimal model training and poor generalization. The fundamental challenge lies in determining how to reliably augment scarce financial time series data to enhance the predictive accuracy of deep learning forecasting models. Our main contribution is a demonstration of how Generative Adversarial Networks (GANs) can effectively serve as a data augmentation tool to overcome data scarcity in the financial domain. Specifically, we show that training a Long Short-Term Memory (LSTM) forecasting model on a dataset augmented with synthetic data generated by a transformer-based GAN (TTS-GAN) significantly improves the forecasting accuracy compared to using real data alone. We confirm these results across different financial time series (Bitcoin and S\&P500 price data) and various forecasting horizons. Furthermore, we propose a novel, time series specific quality metric that combines Dynamic Time Warping (DTW) and a modified Deep Dataset Dissimilarity Measure (DeD-iMs) to reliably monitor the training progress and evaluate the quality of the generated data. These findings provide compelling evidence for the benefits of GAN-based data augmentation in enhancing financial predictive capabilities.

Financial time series augmentation using transformer based GAN architecture

TL;DR

This work shows that training a Long Short-Term Memory forecasting model on a dataset augmented with synthetic data generated by a transformer-based GAN (TTS-GAN) significantly improves the forecasting accuracy compared to using real data alone, and proposes a novel, time series specific quality metric.

Abstract

Time-series forecasting is a critical task across many domains, from engineering to economics, where accurate predictions drive strategic decisions. However, applying advanced deep learning models in challenging, volatile domains like finance is difficult due to the inherent limitation and dynamic nature of financial time series data. This scarcity often results in sub-optimal model training and poor generalization. The fundamental challenge lies in determining how to reliably augment scarce financial time series data to enhance the predictive accuracy of deep learning forecasting models. Our main contribution is a demonstration of how Generative Adversarial Networks (GANs) can effectively serve as a data augmentation tool to overcome data scarcity in the financial domain. Specifically, we show that training a Long Short-Term Memory (LSTM) forecasting model on a dataset augmented with synthetic data generated by a transformer-based GAN (TTS-GAN) significantly improves the forecasting accuracy compared to using real data alone. We confirm these results across different financial time series (Bitcoin and S\&P500 price data) and various forecasting horizons. Furthermore, we propose a novel, time series specific quality metric that combines Dynamic Time Warping (DTW) and a modified Deep Dataset Dissimilarity Measure (DeD-iMs) to reliably monitor the training progress and evaluate the quality of the generated data. These findings provide compelling evidence for the benefits of GAN-based data augmentation in enhancing financial predictive capabilities.
Paper Structure (14 sections, 4 equations, 3 figures, 1 table)

This paper contains 14 sections, 4 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Bitcoin price and volatility before and after US president election 2024
  • Figure 2: TTS-GAN architectureli2022tts
  • Figure 3: Comparison between TTS-GAN convergence measured using Wasserstein distance and DTW DeDiMs metric