QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Jun Han, Shuo Zhang, Wei Li, Zhi Yang, Yifan Dong, Tu Hu, Jialuo Yuan, Xiaomin Yu, Yumo Zhu, Fangqi Lou, Xin Guo, Zhaowei Liu, Tianyi Jiang, Ruichuan An, Jingping Liu, Biao Wu, Rongze Chen, Kunyi Wang, Yifan Wang, Sen Hu, Xinbing Kong, Liwen Zhang, Ronghao Chen, Huacan Wang

TL;DR

QuantaAlpha addresses noisy backtests and non-stationary market regimes by treating alpha mining as trajectory-driven evolution. The framework has four components: diversified planning initialization, controllable factor construction with symbolic AST representations, trajectory-level mutation and crossover, and a final factor pool. Together they enable structured exploration, reliable reuse of validated patterns, and verifiable lineage. The approach enforces semantic consistency and constrains complexity and redundancy to reduce drift and factor crowding. On CSI 300 with GPT-5.2, it reports an Information Coefficient of 0.1501, an ARR of 27.75%, and a maximum drawdown of 7.98%. Factors mined on CSI 300 also transfer to CSI 500 and the S&P 500, with 160% and 137% cumulative excess return over four years, respectively, which suggests robustness under distribution shifts.

Abstract

Financial markets are noisy and non-stationary, making alpha mining highly sensitive to noise in backtesting results and sudden market regime shifts. While recent agentic frameworks improve alpha mining automation, they often lack controllable multi-round search and reliable reuse of validated experience. To address these challenges, we propose QuantaAlpha, an evolutionary alpha mining framework that treats each end-to-end mining run as a trajectory and improves factors through trajectory-level mutation and crossover operations. QuantaAlpha localizes suboptimal steps in each trajectory for targeted revision and recombines complementary high-reward segments to reuse effective patterns, enabling structured exploration and refinement across mining iterations. During factor generation, QuantaAlpha enforces semantic consistency across the hypothesis, factor expression, and executable code, while constraining the complexity and redundancy of the generated factor to mitigate crowding. Extensive experiments on the China Securities Index 300 (CSI 300) demonstrate consistent gains over strong baseline models and prior agentic systems. When utilizing GPT-5.2, QuantaAlpha achieves an Information Coefficient (IC) of 0.1501, with an Annualized Rate of Return (ARR) of 27.75% and a Maximum Drawdown (MDD) of 7.98%. Moreover, factors mined on CSI 300 transfer effectively to the China Securities Index 500 (CSI 500) and the Standard & Poor's 500 Index (S&P 500), delivering 160% and 137% cumulative excess return over four years, respectively, which indicates strong robustness of QuantaAlpha under market distribution shifts.
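
The abstract reports results as IC, ARR, and MDD without defining them here. The following is a minimal sketch of how such metrics are commonly computed, not the paper's evaluation code. The function names, the geometric annualization, and the 252-period year are assumptions for illustration. In practice IC is usually computed per cross-section (for example per trading day) and then averaged, and the paper may annualize returns differently.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(xs):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def information_coefficient(factor, fwd_returns):
    """IC for one cross-section: correlation of factor values with next-period returns."""
    return pearson(factor, fwd_returns)


def rank_ic(factor, fwd_returns):
    """Rank IC: the same correlation computed on ranks (Spearman)."""
    return pearson(ranks(factor), ranks(fwd_returns))


def max_drawdown(period_returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in period_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def annualized_return(period_returns, periods_per_year=252):
    """Geometric annualization (an assumed convention)."""
    equity = 1.0
    for r in period_returns:
        equity *= 1.0 + r
    return equity ** (periods_per_year / len(period_returns)) - 1.0
```

Under these definitions, an IC of 0.1501 would mean a correlation of about 0.15 between factor values and subsequent returns, and an MDD of 7.98% would mean the equity curve never fell more than about 8% below a prior peak.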

Paper Structure

This paper contains 52 sections, 9 equations, 8 figures, and 7 tables.

Figures (8)

  • Figure 1: Cumulative excess returns of different approaches on CSI 500 and S&P 500.
  • Figure 2: Comparison with existing methods: QuantaAlpha improves alpha discovery through trajectory-level self-evolution.
  • Figure 3: Overview of the QuantaAlpha framework. Our approach consists of four core components: (A) Diversified Planning Initialization to generate candidate hypotheses, (B) Factor Realization that iteratively instantiates hypotheses into executable factors with constraint gating, (C) Self-Evolution that applies mutation and crossover over evaluated trajectories, and (D) a Final Factor Pool that consolidates validated effective factors.
  • Figure 4: Ablation study of semantic consistency, complexity, and redundancy controls.
  • Figure 5: Annual IC and Rank IC comparison on the CSI 300.
  • ...and 3 more figures
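
The Self-Evolution component in Figure 3 applies mutation and crossover to evaluated trajectories. The sketch below illustrates one way those two operators could look, under several loud assumptions: the `Step` type, the three stage names, the per-step `reward` field, and the `mark_for_revision` helper are all hypothetical and do not come from the paper. In the real system the revision would presumably be an LLM call, and the child trajectory would be re-evaluated by backtest.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    stage: str     # assumed stages: "hypothesis", "expression", "code"
    content: str
    reward: float  # assumed per-step score assigned after evaluation


Trajectory = List[Step]


def mark_for_revision(step: Step) -> Step:
    """Hypothetical stand-in for an LLM revision: tags the step and resets its score."""
    return replace(step, content=step.content + " (revised)", reward=0.0)


def mutate(traj: Trajectory, revise: Callable[[Step], Step]) -> Trajectory:
    """Localize the lowest-reward step and revise only that step."""
    worst = min(range(len(traj)), key=lambda i: traj[i].reward)
    return [revise(s) if i == worst else s for i, s in enumerate(traj)]


def crossover(a: Trajectory, b: Trajectory) -> Trajectory:
    """Stage by stage, keep whichever parent's step has the higher reward."""
    if [s.stage for s in a] != [s.stage for s in b]:
        raise ValueError("parents must share the same stage layout")
    return [x if x.reward >= y.reward else y for x, y in zip(a, b)]


parent_a = [
    Step("hypothesis", "short-term momentum", 0.8),
    Step("expression", "ts_mean(ret, 5)", 0.2),
    Step("code", "impl_a", 0.6),
]
parent_b = [
    Step("hypothesis", "short-term reversal", 0.3),
    Step("expression", "rank(-ret)", 0.7),
    Step("code", "impl_b", 0.5),
]

child = crossover(parent_a, parent_b)
mutant = mutate(parent_a, mark_for_revision)
```

Here `child` takes the hypothesis and code from `parent_a` and the expression from `parent_b`, while `mutant` differs from `parent_a` only in its weakest step. Mixing steps from different parents this way can break the hypothesis-expression-code consistency that the paper enforces, so a real implementation would need to re-check the child before accepting it.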