H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts
Peilin Tan, Liang Xie, Churan Zhi, Dian Tu, Chuanqi Shi
TL;DR
The paper tackles stock movement prediction under multimodal signals by integrating a hierarchical, hypergraph-based relational structure with LLM-driven semantic enrichment and a Style-Structured Mixture of Experts. It introduces Local and Global Context Hypergraphs to capture fine-grained and persistent market dynamics, respectively, and fuses quantitative signals with textual news via a frozen Llama-3.2-1B backbone for semantic reasoning. The approach achieves state-of-the-art predictive accuracy and investment performance across DJIA, NASDAQ 100, and S&P 100, demonstrating strong risk management through sparse, regime-aware expert routing and JSD-weighted hyperedges. This framework offers a scalable, end-to-end solution for multimodal financial forecasting that leverages structure, language, and style to handle regime shifts and cross-modal interactions in real markets.
Abstract
Stock movement prediction remains fundamentally challenging due to complex temporal dependencies, heterogeneous modalities, and dynamically evolving inter-stock relationships. Existing approaches often fail to unify structural, semantic, and regime-adaptive modeling within a scalable framework. This work introduces H3M-SSMoEs, a novel Hypergraph-based MultiModal architecture with LLM reasoning and Style-Structured Mixture of Experts, integrating three key innovations: (1) a Multi-Context Multimodal Hypergraph that hierarchically captures fine-grained spatiotemporal dynamics via a Local Context Hypergraph (LCH) and persistent inter-stock dependencies through a Global Context Hypergraph (GCH), employing shared cross-modal hyperedges and a Jensen-Shannon Divergence weighting mechanism for adaptive relational learning and cross-modal alignment; (2) an LLM-enhanced reasoning module, which leverages a frozen large language model with lightweight adapters to semantically fuse and align quantitative and textual modalities, enriching representations with domain-specific financial knowledge; and (3) a Style-Structured Mixture of Experts (SSMoEs) that combines shared market experts and industry-specialized experts, each parameterized by learnable style vectors enabling regime-aware specialization under sparse activation. Extensive experiments on three major stock markets demonstrate that H3M-SSMoEs surpasses state-of-the-art methods in both predictive accuracy and investment performance, while exhibiting effective risk control. Datasets, source code, and model weights are available at our GitHub repository: https://github.com/PeilinTime/H3M-SSMoEs.
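
As a rough illustration of the Jensen-Shannon Divergence (JSD) mentioned above, the following minimal sketch computes the base-2 JSD between two discrete distributions. The abstract does not specify which distributions are compared or how the divergence is turned into a hyperedge weight, so the helper `jsd_weight` and its `1 - JSD` mapping are hypothetical choices made only for this example and should not be read as the paper's formulation.

```python
import math


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    p, q = normalize(p), normalize(q)
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def jsd_weight(p, q):
    """Hypothetical mapping (an assumption, not from the paper):
    similar distributions receive a weight close to 1."""
    return 1.0 - js_divergence(p, q)
```

Under this assumed mapping, two identical distributions yield a weight of 1 and two distributions with disjoint support yield a weight of 0. Base-2 logarithms are used so that the divergence stays within [0, 1].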
