Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution

Zijie Zhao, Roy E. Welsch

TL;DR

The Hierarchical Reinforced Trader (HRT) is a trading strategy built on a bi-level Hierarchical Reinforcement Learning framework. It achieves a positive Sharpe ratio that is higher than those of standalone DRL models and the S&P 500 benchmark, which supports the case for incorporating hierarchical structures into DRL strategies.

Abstract

Deep Reinforcement Learning (DRL) has shown promising results in automated stock trading, yet its application faces significant challenges, including the curse of dimensionality, inertia in trading actions, and insufficient portfolio diversification. To address these challenges, we introduce the Hierarchical Reinforced Trader (HRT), a novel trading strategy employing a bi-level Hierarchical Reinforcement Learning framework. The HRT integrates a Proximal Policy Optimization (PPO)-based High-Level Controller (HLC) for strategic stock selection with a Deep Deterministic Policy Gradient (DDPG)-based Low-Level Controller (LLC) tasked with optimizing trade executions to enhance portfolio value. In our empirical analysis, which compares the HRT agent with standalone DRL models and the S&P 500 benchmark during both bullish and bearish market conditions, HRT achieves a positive Sharpe ratio that is higher than those of the baselines. This result not only underscores the efficacy of incorporating hierarchical structures into DRL strategies but also mitigates the aforementioned challenges, paving the way for designing more profitable and robust trading algorithms in complex markets.
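
Since the headline comparison rests on the Sharpe ratio, a minimal sketch of how that metric is commonly computed from daily returns may help. The function name `sharpe_ratio`, the zero default risk-free rate, and the 252-day annualization factor are assumptions made for illustration; the abstract does not state which conventions the authors used.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of simple daily returns.

    Assumptions (not taken from the paper): the annual risk-free rate is
    spread evenly over trading days, and the sample standard deviation
    is used as the risk measure.
    """
    if len(daily_returns) < 2:
        raise ValueError("need at least two returns to estimate volatility")

    # Excess return per period over the per-period risk-free rate.
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in daily_returns]

    volatility = stdev(excess)
    if volatility == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")

    # Scale the per-period ratio to an annual figure.
    return math.sqrt(periods_per_year) * mean(excess) / volatility


if __name__ == "__main__":
    # Hypothetical return series, for illustration only.
    strategy = [0.010, -0.005, 0.020, 0.000, 0.015]
    print(f"Sharpe ratio: {sharpe_ratio(strategy):.2f}")
```

Under these conventions, a strategy has a "positive and higher" Sharpe ratio when its value is above zero and above the value computed for each baseline over the same period.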

Paper Structure

This paper contains 17 sections, 5 equations, 4 figures, 1 table, and 3 algorithms.

Figures (4)

  • Figure 1: Trading operations heatmap on DJIA 30 stocks for 2021 and 2022. The heatmaps depict the log values of trading operations, with a manually set trading threshold of $h_{max} = 100$. Each subfigure corresponds to a different year and trading strategy.
  • Figure 2: Overview of the Hierarchical Reinforced Trader (HRT) architecture. Interactions between the HLC and LLC are indicated by the red arrows.
  • Figure 3: Cumulative return curves of different investment strategies and S&P 500. Values are computed as the mean of ten independent training experiments, each with a different random seed.
  • Figure 4: Comparison of Trading Volume Proportions: DDPG versus HRT. Values are computed as the mean of ten independent experiments, each with a different random seed.