Reinforcement Learning Pair Trading: A Dynamic Scaling approach

Hongshen Yang; Avinash Malik

TL;DR

The paper tackles the challenge of profitable, fast, and adaptive trading in highly volatile cryptocurrency markets by integrating Reinforcement Learning (RL) with pair trading. It introduces an RL-based dynamic scaling framework that enables the agent to decide not only when to trade but also how much capital to allocate (investment quantity), via two agents that handle timing/direction (RL$_1$) and timing/quantity (RL$_2$). Key contributions include an RL environment tailored for quantity-varying pair trading, reward shaping, observation/action spaces, a grid-search protocol for hyperparameters, and empirical results showing substantial profitability gains over traditional pair trading. The findings demonstrate that RL-based pair trading can outperform static rule-based approaches in crypto markets, with significance for designing fast, flexible arbitrage systems in practice.

Abstract

Cryptocurrency is a cryptography-based digital asset with extremely volatile prices. Around USD 70 billion worth of cryptocurrency is traded daily on exchanges. Trading cryptocurrency is difficult due to the inherent volatility of the crypto market. This study investigates whether Reinforcement Learning (RL) can enhance decision-making in cryptocurrency algorithmic trading compared to traditional methods. To address this question, we combined RL with pair trading, a statistical arbitrage technique that exploits the price difference between statistically correlated assets. We constructed RL environments and trained RL agents to determine when and how to trade pairs of cryptocurrencies. We developed new reward shaping and observation/action spaces for reinforcement learning. We ran experiments with the developed reinforcement learner on BTC-GBP and BTC-EUR price data sampled at 1-minute intervals (n=263,520). The traditional non-RL pair trading technique achieved an annualized profit of 8.33%, while the proposed RL-based pair trading technique achieved annualized profits from 9.94% to 31.53%, depending upon the RL learner. Our results show that RL can significantly outperform manual and traditional pair trading techniques when applied to volatile markets such as cryptocurrencies.
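
For readers unfamiliar with the rule-based baseline that the abstract compares against, the following is a minimal sketch of threshold-driven pair trading on a price spread. It is an illustration under stated assumptions, not the paper's implementation: the function name `zscore_signals`, the raw price difference used as the spread, the rolling window, and the thresholds `open_z` and `close_z` are all hypothetical choices made here.

```python
from statistics import mean, stdev


def zscore_signals(prices_a, prices_b, window=5, open_z=1.5, close_z=0.5):
    """Return one position per time step for a pair of price series.

    +1 = long the spread (buy A, sell B), -1 = short the spread, 0 = flat.
    All parameters are illustrative assumptions, not values from the paper.
    """
    # Assumption: the spread is the plain price difference between the legs.
    spread = [a - b for a, b in zip(prices_a, prices_b)]
    position = 0
    positions = []
    for t in range(len(spread)):
        if t < window:
            # Not enough history yet to estimate mean and deviation.
            positions.append(0)
            continue
        history = spread[t - window:t]
        mu, sigma = mean(history), stdev(history)
        z = 0.0 if sigma == 0 else (spread[t] - mu) / sigma
        if position == 0:
            # Open when the spread departs far from its recent mean.
            if z > open_z:
                position = -1
            elif z < -open_z:
                position = 1
        elif abs(z) < close_z:
            # Close once the spread has reverted toward its recent mean.
            position = 0
        positions.append(position)
    return positions


if __name__ == "__main__":
    a = [10, 10.1, 9.9, 10, 10.1, 12, 10]
    b = [10] * 7
    print(zscore_signals(a, b))
```

A fixed rule like this decides only when to open and close, always with the same size. The abstract's RL agents are described as learning when and how to trade, which is the part a static threshold cannot adapt.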

Paper Structure (25 sections, 11 equations, 7 figures, 6 tables)

Introduction
Background
Traditional Pair Trading
Reinforcement Learning
Related Work
Reinforcement Learning in Algorithmic Trading
Reinforcement Learning in Pair Trading
Methodology
Pair Formation
Spread Calculation
Parameter Selection
Reinforcement Learning Pair Trading
Observation Space
Action Space
Reward Shaping
...and 10 more sections

Figures (7)

Figure S1: Stretched pair trading view of the price distance between $p_i$ and $p_j$. Panel (b), which shares its time axis with (a), is a stretched view of (a). Both views show the same actions at the points where the Spread (S) crosses the zones.
Figure S2: Architecture of trading strategies.
Figure S3: Window-size cut for correlation and co-integration testing.
Figure S4: The value of position observation based on investment.
Figure S5: Prices of BTCEUR and BTCGBP.
...and 2 more figures
