LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading

Kuan-Ming Liu, Ming-Chih Lo

TL;DR

This work tackles the limitations of unimodal, static routing in Mixture-of-Experts models for stock trading by introducing LLMoE, which uses a Large Language Model as a dynamic router to fuse historical price data with stock news. The framework comprises an LLM-based router, context-specific optimistic/pessimistic experts, and an All-in All-out trading algorithm, enabling interpretable routing and robust predictions. Empirical results on MSFT and AAPL datasets demonstrate that LLMoE outperforms traditional baselines and a 2-expert MoE across multiple metrics, while also enabling human-like reasoning in its router. By integrating multimodal information and context-aware routing, LLMoE offers a scalable, adaptable approach for intelligent trading and potentially other multimodal financial tasks.

Abstract

Recent advances in deep learning and large language models (LLMs) have facilitated the deployment of the mixture-of-experts (MoE) mechanism in the stock investment domain. While these models have demonstrated promising trading performance, they are often unimodal, neglecting the wealth of information available in other modalities, such as textual data. Moreover, the traditional neural network-based router selection mechanism fails to consider contextual and real-world nuances, resulting in suboptimal expert selection. To address these limitations, we propose LLMoE, a novel framework that employs LLMs as the router within the MoE architecture. Specifically, we replace the conventional neural network-based router with LLMs, leveraging their extensive world knowledge and reasoning capabilities to select experts based on historical price data and stock news. This approach provides a more effective and interpretable selection mechanism. Our experiments on multimodal real-world stock datasets demonstrate that LLMoE outperforms state-of-the-art MoE models and other deep neural network approaches. Additionally, the flexible architecture of LLMoE allows for easy adaptation to various downstream tasks.

Paper Structure

This paper contains 35 sections, 4 equations, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: An illustration comparing traditional single-model approaches, MoE frameworks, and LLMoE. Traditional models use a single predictor with numerical data, MoE adds multiple experts but uses static routing, while LLMoE integrates multimodal data with LLM-driven dynamic routing.
  • Figure 2: Overview of the LLMoE framework, illustrating its three stages: LLM-based router, expert prediction, and trading algorithm generation.
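
The three stages named in Figure 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the keyword-based `route` function is only a stand-in for the LLM router, the two threshold-based experts are hypothetical placeholders for trained models, and the buy/sell rule in `all_in_all_out` is one plausible reading of an "All-in All-out" strategy (invest all cash on a predicted rise, sell everything on a predicted fall). All function names and thresholds are invented for illustration.

```python
from typing import Sequence

# Stage 1 (assumption): stand-in for the LLM router. The real framework
# prompts an LLM with price history and news; here a keyword check
# plays that role so the sketch stays self-contained.
def route(prices: Sequence[float], headline: str) -> str:
    negative_words = ("lawsuit", "recall", "downgrade", "miss")
    text = headline.lower()
    if any(word in text for word in negative_words):
        return "pessimistic"
    return "optimistic"

def last_return(prices: Sequence[float]) -> float:
    """Relative change between the last two prices in the window."""
    return (prices[-1] - prices[-2]) / prices[-2]

# Stage 2 (assumption): two hypothetical experts. Each maps a price
# window to 1 (predict up) or 0 (predict down) with a different bias.
def optimistic_expert(prices: Sequence[float]) -> int:
    return 1 if last_return(prices) > -0.01 else 0

def pessimistic_expert(prices: Sequence[float]) -> int:
    return 1 if last_return(prices) > 0.01 else 0

EXPERTS = {"optimistic": optimistic_expert, "pessimistic": pessimistic_expert}

def predict(prices: Sequence[float], headline: str) -> int:
    """Route the sample to one expert and return its up/down signal."""
    return EXPERTS[route(prices, headline)](prices)

# Stage 3 (assumption): all-in on an "up" signal, all-out on a "down" signal.
def all_in_all_out(prices: Sequence[float], signals: Sequence[int],
                   cash: float = 1000.0) -> float:
    shares = 0.0
    for price, signal in zip(prices, signals):
        if signal == 1 and cash > 0:
            shares, cash = cash / price, 0.0      # buy with all cash
        elif signal == 0 and shares > 0:
            cash, shares = shares * price, 0.0    # sell the whole position
    return cash + shares * prices[-1]             # final portfolio value
```

For example, `predict([100.0, 100.5], "Company faces lawsuit")` is routed to the pessimistic expert, which requires a gain above 1% before signalling "up". Keeping the router separate from the experts mirrors the modular design described in the abstract, where the routing decision is made from both price data and news while each expert only sees prices.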