MIGA: Mixture-of-Experts with Group Aggregation for Stock Market Prediction

Zhaojian Yu, Yinghao Wu, Genesis Wang, Heming Weng

TL;DR

This paper presents MIGA, a novel Mixture-of-Experts with Group Aggregation framework that generates specialized predictions for stocks with different styles by dynamically switching between distinct style experts. It also proposes a novel inner group attention architecture that lets experts within the same group share information.

Abstract

Stock market prediction has remained an extremely challenging problem for many decades owing to its inherently high volatility and low signal-to-noise ratio. Existing solutions based on machine learning or deep learning have shown strong performance by employing a single model, trained on the entire stock dataset, to generate predictions across all types of stocks. However, because stock styles and market trends vary significantly, a single end-to-end model struggles to fully capture the differences in these stylized stock features, leading to relatively inaccurate predictions for all types of stocks. In this paper, we present MIGA, a novel Mixture-of-Experts with Group Aggregation framework designed to generate specialized predictions for stocks with different styles by dynamically switching between distinct style experts. To promote collaboration among different experts in MIGA, we propose a novel inner group attention architecture, which enables experts within the same group to share information and thereby enhances the overall performance of all experts. As a result, MIGA significantly outperforms other end-to-end models on three Chinese stock index benchmarks: CSI300, CSI500, and CSI1000. Notably, MIGA-Conv reaches a 24% excess annual return on the CSI300 benchmark, surpassing the previous state-of-the-art model by 8% absolute. Furthermore, we conduct a comprehensive analysis of mixture of experts for stock market prediction, providing valuable insights for future research.
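
The two mechanisms named in the abstract, routing each stock to a few style experts and letting experts in the same group share information through attention, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the linear "experts", the top-k gating rule, and the dense evaluation of every expert are all assumptions made for illustration; the abstract only states that MIGA switches dynamically between experts and applies attention within each expert group.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf receive exactly zero weight.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

def top_k_gates(logits, top_k):
    # Assumed routing rule: keep the top_k router logits per stock, renormalise over them.
    top_idx = np.argpartition(logits, -top_k, axis=1)[:, -top_k:]
    masked = np.full_like(logits, -np.inf)
    np.put_along_axis(masked, top_idx,
                      np.take_along_axis(logits, top_idx, axis=1), axis=1)
    return softmax(masked, axis=1)

def inner_group_attention(hidden, n_groups):
    # hidden: (n_stocks, n_experts, h). Each expert attends only to the
    # experts that belong to its own group (plain dot-product self-attention).
    n_stocks, n_experts, h = hidden.shape
    grouped = hidden.reshape(n_stocks, n_groups, n_experts // n_groups, h)
    scores = np.einsum("sgph,sgqh->sgpq", grouped, grouped) / np.sqrt(h)
    mixed = np.einsum("sgpq,sgqh->sgph", softmax(scores, axis=-1), grouped)
    return mixed.reshape(n_stocks, n_experts, h)

def miga_like_forward(features, router_w, expert_w, head_w, n_groups, top_k):
    # features: (n_stocks, d); router_w: (d, n_experts);
    # expert_w: (n_experts, d, h); head_w: (h,). All weights are hypothetical.
    hidden = np.einsum("sd,edh->seh", features, expert_w)  # per-expert hidden states
    hidden = inner_group_attention(hidden, n_groups)       # share info within groups
    expert_pred = hidden @ head_w                          # (n_stocks, n_experts)
    gates = top_k_gates(features @ router_w, top_k)        # sparse mixing weights
    return np.sum(gates * expert_pred, axis=1), gates
```

Calling `miga_like_forward` with 63 experts, `n_groups=7`, and `top_k=8` mirrors the expert configuration reported in the Figure 4 caption. Each stock then receives nonzero gate weight on exactly eight experts.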

Paper Structure

This paper contains 18 sections, 13 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Comparison between MIGA and single end-to-end model. MIGA generates specialized predictions for stocks with different styles by dynamically switching between distinct style experts.
  • Figure 2: Overview of Mixture-of-Experts with Group Aggregation. Inner group attention refers to the self-attention mechanism within each expert group, which facilitates the aggregation of information among experts within the same group.
  • Figure 3: The comparison of the training loss and validation set IC between MIGA and single end-to-end models during the first 8 epochs, where the optimal performance of most models emerges.
  • Figure 4: Comparison of different numbers of experts. The optimal setting of 8 out of 63 experts (7 groups of 9 experts each) achieves the best result on 8 of the 12 metrics.
  • Figure 5: Stock specialization (top) and portfolio specialization (bottom) of MIGA-Conv.
  • ...and 1 more figures
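
The Figure 3 caption reports validation-set IC. In quantitative finance, the information coefficient (IC) is commonly computed as the cross-sectional correlation between predicted scores and realized returns on each trading day, averaged over days. The sketch below follows that common definition with Pearson correlation. This section does not say whether the paper uses Pearson or rank correlation, so that choice and the function names are assumptions.

```python
import math

def pearson(xs, ys):
    # Pearson correlation between two equally long sequences of numbers.
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)

def mean_daily_ic(preds_by_day, returns_by_day):
    # One correlation per day across that day's stocks, then the average over days.
    daily = [pearson(p, r) for p, r in zip(preds_by_day, returns_by_day)]
    return sum(daily) / len(daily)
```

A day on which predictions rank stocks in exactly the realized order contributes an IC near 1, and a day with the order reversed contributes an IC near -1.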