MTS: A Deep Reinforcement Learning Portfolio Management Framework with Time-Awareness and Short-Selling

Fengchen Gu, Zhengyong Jiang, Ángel F. García-Fernández, Angelos Stefanidis, Jionglong Su, Huakang Li

TL;DR

The paper tackles adaptive portfolio management under dynamic risk and market timing by introducing MTS, a deep reinforcement learning framework that combines time-aware encoding, Incremental CVaR (ICVaR) risk control, and a parallel short-selling framework. It formulates the problem as a Markov decision process with a time-aware embedding and a short-selling control module to exploit market trends while containing tail risk. Empirical results on five DJIA datasets (2019–2023) show MTS achieving superior cumulative returns and risk-adjusted performance (e.g., average cumulative return up 30.67% and Sharpe up 29.33% relative to the next-best method), including strong long-horizon performance and favorable Sortino and Omega ratios. The work demonstrates the practical potential of integrating time-aware attention, dynamic tail-risk measures, and parallel strategy components for robust, real-time portfolio management in volatile markets.

Abstract

Portfolio management remains a crucial challenge in finance, with traditional methods often falling short in complex and volatile market environments. While deep reinforcement learning approaches have shown promise, they still face limitations in dynamic risk management, exploitation of temporal market characteristics, and incorporation of complex trading strategies such as short-selling. These limitations can lead to suboptimal portfolio performance, increased vulnerability to market volatility, and missed opportunities to capture returns under diverse market conditions. This paper introduces a Deep Reinforcement Learning Portfolio Management Framework with Time-Awareness and Short-Selling (MTS), offering a robust and adaptive strategy for sustainable investment performance. The framework addresses these limitations with three components: a novel encoder-attention mechanism that incorporates temporal market characteristics, a parallel strategy for automated short-selling based on market trends, and risk management through an innovative Incremental Conditional Value at Risk (ICVaR) measure. Experimental validation on five diverse datasets from 2019 to 2023 demonstrates MTS's superiority over traditional algorithms and advanced machine learning techniques. MTS consistently achieves higher cumulative returns and higher Sharpe, Omega, and Sortino ratios, underscoring its effectiveness in balancing risk and return while adapting to market dynamics. Across the datasets, MTS shows an average relative increase of 30.67% in cumulative returns and 29.33% in Sharpe ratio compared to the next best-performing strategies.
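For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch of their textbook per-period definitions. It is not taken from the paper: the function names, the zero risk-free rate, the zero target/threshold, and the absence of annualization are all assumptions made for illustration, and the paper's own conventions may differ.

```python
import math
from statistics import mean, stdev


def cumulative_return(returns):
    """Compound simple per-period returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over its sample standard deviation (not annualized).

    Assumes at least two returns with non-zero spread.
    """
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def sortino_ratio(returns, target=0.0):
    """Mean return above target over downside deviation (not annualized).

    Assumes at least one return falls below the target.
    """
    downside_sq = [min(r - target, 0.0) ** 2 for r in returns]
    downside_dev = math.sqrt(sum(downside_sq) / len(returns))
    return (mean(returns) - target) / downside_dev


def omega_ratio(returns, threshold=0.0):
    """Sum of gains above threshold over sum of shortfalls below it.

    Assumes at least one return falls below the threshold.
    """
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    return gains / losses


# Hypothetical per-period returns, used only to exercise the functions.
sample_returns = [0.02, -0.01, 0.03, -0.02]
metrics = {
    "cumulative_return": cumulative_return(sample_returns),
    "sharpe": sharpe_ratio(sample_returns),
    "sortino": sortino_ratio(sample_returns),
    "omega": omega_ratio(sample_returns),
}
```

Sharpe penalizes all volatility, whereas Sortino and Omega only penalize outcomes below a chosen target, which is why the three are usually reported together when a strategy's return distribution is asymmetric.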

Paper Structure

This paper contains 17 sections, 19 equations, 11 figures, 2 tables, 4 algorithms.

Figures (11)

  • Figure 1: Interaction Process for Time Series. $V_t$ is the closing price vector in period $t$. $P_t$ is the portfolio value at the end of period $t$, and $P_{t+1}'$ is the portfolio value after it trades.
  • Figure 2: The Markov decision process of the DRL environment, showing the interaction among state, reward, action, DRL agents, and the stock market environment.
  • Figure 3: The proposed MTS framework for portfolio management. The first part is data input and preprocessing, which involves separating the time features and feeding them along with the stock data into the neural network. The second part involves the training of neural networks and reinforcement learning, where the core of the neural network is Time-Aware Embedding and Attention. It outputs actions $a_t$ for portfolio management and also receives outputs from the environment. The third part is the portfolio management environment, which consists of two main components. The first component is risk control that incorporates ICVaR. The second component is a stock trading mechanism that allows short selling and utilizes parallel strategies.
  • Figure 4: Results of the comparative experiments (Dataset 1)
  • Figure 5: Results of the comparative experiments (Dataset 2)
  • ...and 6 more figures
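
The Figure 3 caption names ICVaR as the risk-control component, but this section does not define how the incremental variant is computed. As background only, here is a minimal sketch of the standard historical (empirical) VaR and CVaR on which such measures build. The function name, the confidence level, and the quantile convention are assumptions for illustration; this is plain CVaR, not the paper's ICVaR.

```python
import math


def historical_var_cvar(returns, alpha=0.95):
    """Empirical VaR and CVaR of losses at confidence level alpha.

    Losses are negated returns. VaR is the empirical alpha-quantile of the
    losses; CVaR is the mean of the losses at or beyond that quantile.
    Other quantile conventions exist and give slightly different values.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    losses = sorted(-r for r in returns)  # ascending: worst losses last
    n = len(losses)
    k = min(n - 1, max(0, math.ceil(alpha * n) - 1))  # index of the quantile
    var = losses[k]
    tail = losses[k:]
    cvar = sum(tail) / len(tail)
    return var, cvar


# Hypothetical per-period returns, used only to exercise the function.
sample_returns = [-0.10, -0.05, 0.0, 0.01, 0.02, 0.03, 0.01, 0.02, -0.02, 0.04]
var_80, cvar_80 = historical_var_cvar(sample_returns, alpha=0.8)
```

Because CVaR averages over the whole tail rather than reading off a single quantile, it is never smaller than VaR at the same level, which makes it the more conservative of the two as a risk constraint.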