Statistical Arbitrage in Polish Equities Market Using Deep Learning Techniques

Marek Adamczyk, Michał Dąbrowski

TL;DR

This paper adapts Statistical Arbitrage, specifically Pairs Trading, to the Polish equity market by replacing the second asset of a pair with a replication of the first built from risk-factor representations. It evaluates three replication methods within the Avellaneda–Lee framework: PCA-derived eigenportfolios, LSTM-based weightings, and ETFs (real and artificial) as proxies for market factors. The universe consists of 60 major Polish stocks (WIG20 and mWIG40). In 2017–2019 backtests all three methods were profitable, with PCA reaching roughly 20% cumulative return and an annualized Sharpe ratio of up to 2.63. During the 2020 COVID-19 recession only the ETF-based strategies remained profitable (about 5% annual return), whereas PCA and LSTM underperformed; the LSTM results, although negative, point to potential with further optimization. Overall, the work supports the viability of ETF proxies and deep-learning-based replication for statistical arbitrage in a smaller developed market and provides a roadmap for refining LSTM-based replication in future work.

Abstract

We study a systematic approach to a popular Statistical Arbitrage technique: Pairs Trading. Instead of relying on two highly correlated assets, we replace the second asset with a replication of the first, built from risk-factor representations. These factors are obtained through Principal Component Analysis (PCA), exchange-traded funds (ETFs), and, as our main contribution, Long Short-Term Memory networks (LSTMs). Residuals between the main asset and its replication are examined for mean-reversion properties, and trading signals are generated for portfolios that revert to the mean sufficiently fast. Beyond introducing a deep-learning-based replication method, we adapt the framework of Avellaneda and Lee (2008) to the Polish market. Accordingly, components of WIG20, mWIG40, and selected sector indices replace the original S&P 500 universe, and market parameters such as the risk-free rate and transaction costs are updated to reflect local conditions. We outline the full strategy pipeline: risk-factor construction, residual modeling via the Ornstein-Uhlenbeck process, and signal generation. Each replication technique is described together with its practical implementation. Strategy performance is evaluated over two periods: 2017-2019 and the recession year 2020. All methods yield profits in 2017-2019, with PCA achieving roughly 20 percent cumulative return and an annualized Sharpe ratio of up to 2.63. Despite multiple adaptations, our conclusions remain consistent with those of the original paper. During the COVID-19 recession, only the ETF-based approach remains profitable (about 5 percent annual return), while the PCA and LSTM methods underperform. The LSTM results, although negative, are promising and indicate potential for future optimization.
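The pipeline named in the abstract (replication by risk factors, Ornstein-Uhlenbeck modeling of the residual, signal generation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 252-day annualization constant, the AR(1) estimation route, and the entry threshold of 1.25 are all assumptions made for illustration, and the factor returns are taken as given (they could come from PCA eigenportfolios, ETFs, or an LSTM).

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization constant, not taken from the paper


def regression_residuals(stock_ret, factor_ret):
    """Regress stock returns on factor returns (OLS with intercept).

    The fitted part plays the role of the replication; the residual
    is the idiosyncratic return that is tested for mean reversion.
    """
    X = np.column_stack([np.ones(len(stock_ret)), factor_ret])
    beta, *_ = np.linalg.lstsq(X, stock_ret, rcond=None)
    return stock_ret - X @ beta


def fit_ou(resid):
    """Fit AR(1) to the cumulative residual and map it to OU parameters."""
    x = np.cumsum(resid)                    # discrete analogue of the OU process
    b, a = np.polyfit(x[:-1], x[1:], 1)     # x[n+1] = a + b * x[n] + noise
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (b outside (0, 1))")
    noise = x[1:] - (a + b * x[:-1])
    kappa = -np.log(b) * TRADING_DAYS       # annualized speed of mean reversion
    m = a / (1.0 - b)                       # long-run mean
    sigma_eq = np.sqrt(np.var(noise, ddof=2) / (1.0 - b ** 2))
    s_score = (x[-1] - m) / sigma_eq        # distance from the mean in equilibrium std units
    return {"b": b, "kappa": kappa, "m": m, "sigma_eq": sigma_eq, "s_score": s_score}


def entry_signal(s_score, threshold=1.25):
    """Map an s-score to an opening signal; the threshold is illustrative."""
    if s_score < -threshold:
        return "open_long"    # residual unusually low: buy stock, sell replication
    if s_score > threshold:
        return "open_short"   # residual unusually high: sell stock, buy replication
    return "none"
```

In use, one would call `regression_residuals` on a rolling window of returns, pass the result to `fit_ou`, discard stocks whose `kappa` is too small to count as fast mean reversion, and feed the remaining `s_score` values to `entry_signal`. The window length and the minimum `kappa` are tuning choices that this sketch leaves open.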

Paper Structure

This paper contains 38 sections, 3 theorems, 70 equations, 52 figures, 13 tables.

Key Result

Lemma 2.1

For a deterministic function $f \in \mathcal{L}_2$ and $0 \leq s < t$, where $\int_s^t (\cdot)\, dB_u$ is Itô's integral.

Figures (52)

  • Figure 1: WIG daily close prices throughout years
  • Figure 2: WIG20 and Total Return WIG20TR daily close prices throughout 2022
  • Figure 3: WIG20 companies by share in index (as of July 2023)
  • Figure 4: WIG20 and its components' daily returns correlation matrix
  • Figure 5: WIG20 and its components' relative daily close prices
  • ...and 47 more figures

Theorems & Definitions (5)

  • Definition 2.1
  • Lemma 2.1
  • Definition 2.2
  • Lemma 3.1
  • Theorem 3.1: Universal approximation theorem