Cognitive Alpha Mining via LLM-Driven Code-Based Evolution
Fengyuan Liu, Huang Yi, Sichun Luo, Yuqi Wang, Yazheng Yang, Xinye Li, Zefa Hu, Junlan Feng, Qi Liu
TL;DR
CogAlpha introduces Cognitive Alpha Mining, merging a code-based alpha representation with LLM-driven reasoning and evolutionary search to expand the alpha discovery space in high-dimensional financial data. It uses a seven-level agent hierarchy, diversified guidance, and a multi-agent quality checker to enable deeper thinking and produce economically interpretable alphas. On CSI300 A-share data, CogAlpha outperforms 19 baselines in predictive power, stability, and generalization, while also delivering interpretable alpha formulas. The work demonstrates the practical value of aligning evolutionary optimization with LLM reasoning for automated, explainable alpha discovery, and the authors plan to release the source code.
Abstract
Discovering effective predictive signals, or "alphas," from financial data with high dimensionality and extremely low signal-to-noise ratio remains a difficult open problem. Despite progress in deep learning, genetic programming, and, more recently, large language model (LLM)-based factor generation, existing approaches still explore only a narrow region of the vast alpha search space. Neural models tend to produce opaque and fragile patterns, while symbolic or formula-based methods often yield redundant or economically ungrounded expressions that generalize poorly. Although different in form, these paradigms share a key limitation: none can conduct broad, structured, and human-like exploration that balances logical consistency with creative leaps. To address this gap, we introduce the Cognitive Alpha Mining Framework (CogAlpha), which combines code-level alpha representation with LLM-driven reasoning and evolutionary search. Treating LLMs as adaptive cognitive agents, our framework iteratively refines, mutates, and recombines alpha candidates through multi-stage prompts and financial feedback. This synergistic design enables deeper thinking, richer structural diversity, and economically interpretable alpha discovery, while greatly expanding the effective search space. Experiments on A-share equities demonstrate that CogAlpha consistently discovers alphas with superior predictive accuracy, robustness, and generalization over existing methods. Our results highlight the promise of aligning evolutionary optimization with LLM-based reasoning for automated and explainable alpha discovery. All source code will be released.
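To make the idea of evolving code-level alphas under financial feedback concrete, the following is a minimal sketch and not the CogAlpha implementation. Everything in it is an assumption made for illustration: the toy data, the two feature names (`momentum`, `volume`), the use of a rank information coefficient as the feedback signal, and the function `stub_llm_mutate`, which is a random perturbation standing in for an LLM call. It shows only the general shape of such a loop: alphas are stored as source code, compiled, scored, and selected across generations.

```python
import random
from statistics import mean

# Hypothetical code template: each alpha is kept as Python source, not as a tree.
TEMPLATE = (
    "def alpha(f):\n"
    "    return [{a!r} * m + {b!r} * v "
    "for m, v in zip(f['momentum'], f['volume'])]\n"
)

def make_toy_data(n=200, seed=0):
    """Synthetic features and next-period returns with a weak momentum signal."""
    rng = random.Random(seed)
    momentum = [rng.gauss(0, 1) for _ in range(n)]
    volume = [rng.gauss(0, 1) for _ in range(n)]
    future_ret = [0.3 * m + rng.gauss(0, 1) for m in momentum]
    return {"momentum": momentum, "volume": volume}, future_ret

def render(a, b):
    """Turn a parameter pair into alpha source code."""
    return TEMPLATE.format(a=a, b=b)

def compile_alpha(source):
    """Execute alpha source in a fresh namespace and return the function."""
    namespace = {}
    exec(source, namespace)
    return namespace["alpha"]

def ranks(xs):
    """Ordinal ranks (ties broken by position; fine for continuous toy data)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    for rank, i in enumerate(order):
        out[i] = float(rank)
    return out

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def rank_ic(signal, future_ret):
    """Spearman-style rank correlation, used here as the assumed feedback."""
    return pearson(ranks(signal), ranks(future_ret))

def fitness(params, features, future_ret):
    alpha = compile_alpha(render(*params))
    return rank_ic(alpha(features), future_ret)

def stub_llm_mutate(params, rng):
    """Placeholder for an LLM rewrite: randomly perturbs the coefficients."""
    a, b = params
    return (a + rng.gauss(0, 0.3), b + rng.gauss(0, 0.3))

def evolve(features, future_ret, pop_size=6, generations=5, seed=1):
    """Elitist loop: mutate every parent, keep the best pop_size candidates."""
    rng = random.Random(seed)
    population = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(pop_size)]

    def score(p):
        return fitness(p, features, future_ret)

    history = [max(score(p) for p in population)]
    for _ in range(generations):
        children = [stub_llm_mutate(p, rng) for p in population]
        population = sorted(population + children, key=score, reverse=True)[:pop_size]
        history.append(score(population[0]))
    best = population[0]
    return render(*best), score(best), history

if __name__ == "__main__":
    feats, rets = make_toy_data()
    source, ic, hist = evolve(feats, rets)
    print(source)
    print("rank IC:", ic)
```

Because selection keeps parents alongside their children, the best score in `history` never decreases from one generation to the next. In a real system the mutation step would rewrite the alpha's code rather than nudge two coefficients, and the feedback would come from held-out market data rather than a single synthetic cross-section.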
