Revisiting Ensemble Methods for Stock Trading and Crypto Trading Tasks at ACM ICAIF FinRL Contest 2023-2024
Nikolaus Holzer, Keyi Wang, Kairong Xiao, Xiao-Yang Liu Yanglet
TL;DR
The paper tackles policy instability and sampling bottlenecks in financial reinforcement learning by combining ensemble methods with massively parallel GPU simulations. It introduces vectorized market environments and GPU-resident data storage to achieve up to $1{,}746\times$ sampling speedups on a single GPU and demonstrates that ensembles can yield robust, risk-aware performance in both stock and cryptocurrency trading. Key contributions include KL-divergence-driven agent diversity, two tailored ensemble designs (weighted-action for stocks and majority-vote for crypto), and comprehensive evaluations showing improved risk metrics such as reduced maximum drawdown and enhanced Sharpe ratios. The work has practical implications for scalable, robust FinRL training and proposes future directions like zero-knowledge proof validation for model provenance in trading contexts.
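As a rough illustration of the two ensemble designs named above, the sketch below combines per-agent outputs in two ways: a weighted average of continuous action vectors (the stock-trading style) and a majority vote over discrete actions (the crypto-trading style). The function names, the weight values, and the tie-breaking rule are assumptions made for this example; the summary does not specify how the paper sets its weights or resolves ties.

```python
from collections import Counter


def weighted_action(actions, weights):
    """Weighted average of per-agent continuous action vectors.

    actions: one action vector per agent, all of equal length.
    weights: one non-negative weight per agent (hypothetical values).
    """
    total = sum(weights)
    n_assets = len(actions[0])
    return [
        sum(w * a[i] for a, w in zip(actions, weights)) / total
        for i in range(n_assets)
    ]


def majority_vote(actions):
    """Most common discrete action across agents.

    Ties go to the action that appears first (an assumption of this sketch).
    """
    return Counter(actions).most_common(1)[0][0]


# Two agents, two assets; the first agent is trusted three times as much.
stock_action = weighted_action([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])

# Three agents voting on a single discrete crypto action.
crypto_action = majority_vote(["buy", "hold", "buy"])
```

Averaging suits continuous position sizes, where intermediate values are meaningful, while voting suits a small discrete action set, where averaging labels would not make sense.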
Abstract
Reinforcement learning has demonstrated great potential for performing financial tasks. However, it faces two major challenges: policy instability and sampling bottlenecks. In this paper, we revisit ensemble methods with massively parallel simulations on graphics processing units (GPUs), significantly enhancing the computational efficiency and robustness of trained models in volatile financial markets. Our approach leverages the parallel processing capability of GPUs to speed up sampling when training ensemble models. The ensemble models combine the strengths of component agents to improve the robustness of financial decision-making strategies. We conduct experiments in both stock and cryptocurrency trading tasks to evaluate the effectiveness of our approach. Massively parallel simulation on a single GPU improves the sampling speed by up to $1{,}746\times$ using $2{,}048$ parallel environments compared to a single environment. The ensemble models have high cumulative returns and outperform some individual agents, reducing maximum drawdown by up to $4.17\%$ and improving the Sharpe ratio by up to $0.21$. This paper describes trading tasks at ACM ICAIF FinRL Contests in 2023 and 2024.
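For readers unfamiliar with the two risk metrics quoted above, the following minimal sketch computes them using their standard textbook definitions. The function names, the zero risk-free rate, and the 252-period annualization factor are assumptions of this example, not details taken from the paper.

```python
import math
import statistics


def max_drawdown(values):
    """Largest peak-to-trough decline of a portfolio value series, as a fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # running maximum so far
        worst = max(worst, (peak - v) / peak)  # decline relative to that peak
    return worst


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns.

    risk_free is a per-period rate; periods_per_year=252 assumes daily data.
    Uses the sample standard deviation, so at least two returns are required.
    """
    excess = [r - risk_free for r in returns]
    return (
        math.sqrt(periods_per_year)
        * statistics.mean(excess)
        / statistics.stdev(excess)
    )


# A portfolio that peaks at 120 and then falls to 90 has a 25% drawdown.
dd = max_drawdown([100, 120, 90, 110])
```

A lower maximum drawdown and a higher Sharpe ratio both indicate better risk-adjusted behavior, which is why the abstract reports a reduction in the former and an increase in the latter.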
