MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading

Xi Cheng, Jinghao Zhang, Yunan Zeng, Wenfang Xue

TL;DR

This work tackles distribution shifts in market patterns for algorithmic trading by introducing MOT, a reinforcement learning framework that combines multiple actors with disentangled representations, an OT-based sample allocation mechanism, and a Pretrain Module for imitation learning. The method initializes from expert behavior, employs PPO-based imitation learning, and uses an Allocation Module regulated by Optimal Transport to model diverse market regimes. Empirical results on minute-level CSI 300 futures data show MOT achieves superior profitability with strong risk control, with ablations indicating that OT contributes the most to gains and two actors best capture market patterns. Overall, MOT demonstrates that integrating pattern-aware representation, imitation-informed initialization, and principled sample routing can substantially improve RL-based trading under non-stationary conditions.

Abstract

Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions because the data distribution shifts. Ignoring these multiple patterns undermines the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module that facilitates imitation learning by aligning the outputs of the actors with an expert strategy, which better balances exploration and exploitation in RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks. Ablation studies validate the effectiveness of the components of MOT.

Paper Structure (15 sections, 7 equations, 5 figures, 2 tables, 1 algorithm)

Figures (5)

  • Figure 1: Profit of strategies in different market conditions. A bull market is suitable for momentum trading, while a volatile market is suitable for mean reversion trading.
  • Figure 2: The architecture of MOT. First, we pretrain the actor using the expert strategy and then proceed with imitation learning. We model different market patterns using multiple actors and allocate samples to the actors using the Allocation Module.
  • Figure 3: OT assigns each sample $x$ to the actor with the minimum $L_{err}^{ij}$ while keeping the allocation proportions balanced, $\frac{x\ \text{to Actor 1}}{x\ \text{to Actor 2}}\approx\frac{w_1}{w_2}$. The pink circles represent $L_{err}^{ij}$.
  • Figure 4: Performance of different models in terms of ARR
  • Figure 5: Effectiveness of OT modeling
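
The balanced allocation described in the Figure 3 caption can be illustrated with a small sketch. It assumes an entropy-regularized OT problem solved with Sinkhorn iterations, which is one common choice and not necessarily the solver used in the paper. The function name `sinkhorn_allocate`, the parameters `eps` and `n_iters`, and the random cost matrix standing in for $L_{err}^{ij}$ are all hypothetical.

```python
import numpy as np

def sinkhorn_allocate(cost, weights, eps=0.05, n_iters=200):
    """Soft sample-to-actor allocation via entropy-regularized OT.

    cost[i, j] stands in for L_err^{ij}, the error of sample i under actor j.
    weights[j] is the target share w_j of samples routed to actor j.
    (Assumption: Sinkhorn is an illustrative solver, not the paper's stated one.)
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)                  # each sample carries equal mass
    b = np.asarray(weights, dtype=float)
    b = b / b.sum()                          # normalize target proportions
    K = np.exp(-(cost - cost.min()) / eps)   # Gibbs kernel; shift for stability
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (K @ v)                      # rescale rows toward marginal a
        v = b / (K.T @ u)                    # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]       # transport plan, shape (n, k)

# Toy example: 8 samples, 2 actors, equal target proportions w_1 = w_2.
rng = np.random.default_rng(0)
cost = rng.random((8, 2))
plan = sinkhorn_allocate(cost, weights=[0.5, 0.5])
assignment = plan.argmax(axis=1)             # hard routing: most mass per sample
```

Each row of `plan` spreads one sample's mass across the actors, favoring the actor with the lower error, while the column sums are held at the target proportions $w_1, w_2$. Taking the row-wise `argmax` gives a hard routing, which follows those proportions only approximately.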