Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies
Zheli Xiong
TL;DR
The paper addresses risk management in trading by combining ensemble reinforcement learning with classifier-based decisions. It introduces a variance-aware ensemble framework that aggregates confidence across multiple RL agents and classifiers, applying adaptive decision rules governed by a variance threshold $\tau$. Empirical results in a FinRL setting show that ensemble methods consistently improve risk-adjusted performance (higher Cumulative Returns, Sharpe Ratios, and Calmar Ratios) and reduce drawdowns, though their effectiveness is highly sensitive to $\tau$, which is best adjusted dynamically. The approach offers a robust, adaptive strategy with potential applications beyond finance to robotics and other dynamic decision-making domains.
Abstract
This paper presents a comprehensive study of ensemble Reinforcement Learning (RL) models in financial trading strategies, using classifier models to enhance performance. By combining RL algorithms such as A2C, PPO, and SAC with traditional classifiers such as Support Vector Machines (SVM), Decision Trees, and Logistic Regression, we investigate how different classifier groups can be integrated to improve risk-return trade-offs. The study evaluates several ensemble methods and compares them with individual RL models on key financial metrics: Cumulative Returns, Sharpe Ratio (SR), Calmar Ratio, and Maximum Drawdown (MDD). Our results demonstrate that ensemble methods consistently outperform the base models in risk-adjusted returns, with better drawdown management and greater overall stability. However, ensemble performance is sensitive to the choice of the variance threshold $\tau$, which highlights the importance of adjusting $\tau$ dynamically to achieve optimal performance. This study emphasizes the value of combining RL with classifiers for adaptive decision-making, with implications for financial trading, robotics, and other dynamic environments.
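For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions, computed from a series of per-period simple returns. It is not the paper's evaluation code: the function names, the 252-period annualization factor, and the zero risk-free rate are assumptions made for illustration.

```python
import numpy as np


def cumulative_return(returns):
    """Total compounded return over the whole series."""
    r = np.asarray(returns, dtype=float)
    return float(np.prod(1.0 + r) - 1.0)


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe Ratio; periods_per_year=252 assumes daily data."""
    r = np.asarray(returns, dtype=float) - risk_free / periods_per_year
    sd = r.std(ddof=1)
    if sd == 0:
        return float("nan")  # undefined when returns have no variance
    return float(np.sqrt(periods_per_year) * r.mean() / sd)


def max_drawdown(returns):
    """Largest peak-to-trough decline, reported as a non-positive fraction."""
    r = np.asarray(returns, dtype=float)
    # Prepend the starting capital (1.0) so an initial loss counts as drawdown.
    equity = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    running_peak = np.maximum.accumulate(equity)
    return float(((equity - running_peak) / running_peak).min())


def calmar_ratio(returns, periods_per_year=252):
    """Annualized compounded return divided by the magnitude of MDD."""
    r = np.asarray(returns, dtype=float)
    annualized = np.prod(1.0 + r) ** (periods_per_year / len(r)) - 1.0
    mdd = abs(max_drawdown(r))
    if mdd == 0:
        return float("nan")  # undefined when there is no drawdown
    return float(annualized / mdd)


if __name__ == "__main__":
    example = [0.10, -0.50, 0.20]  # hypothetical per-period returns
    print("Cumulative return:", cumulative_return(example))
    print("Sharpe Ratio:", sharpe_ratio(example))
    print("Maximum Drawdown:", max_drawdown(example))
    print("Calmar Ratio:", calmar_ratio(example))
```

Under these definitions a higher Sharpe or Calmar Ratio indicates better risk-adjusted performance, while a Maximum Drawdown closer to zero indicates smaller losses from a peak.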
