Alpha Mining and Enhancing via Warm Start Genetic Programming for Quantitative Investment
Weizhe Ren, Yichen Qin, Yang Li
TL;DR
This work tackles the challenge of discovering stock alpha factors with genetic programming by introducing a Warm Start GP framework that confines the search to a predefined, effective alpha structure and starts from a proven alpha. It relies on two hypotheses, structure-effectiveness and factor-effectiveness, to justify searching within a fixed structure, and shows that this yields a finite, denser space of viable alphas, lower correlation among candidates, and improved out-of-sample predictive power. Empirical validation on 2020–2024 Chinese stock market data shows higher IC-based metrics and stronger portfolio performance than Alpha101 baselines and traditional GP, including AR > $50\%$ and SR > $1.0$ for larger holding portfolios. The framework thus acts as both an alpha miner and an alpha enhancer, offering interpretability and efficiency benefits, while pointing to future work on more advanced aggregation models and on reducing GP's computational cost.
Abstract
Traditional genetic programming (GP) often struggles in stock alpha factor discovery because of its vast search space, heavy computational burden, and the sparsity of effective alphas. We find that GP performs better when it focuses on promising regions rather than searching at random. This paper proposes a new GP framework with carefully chosen initialization and structural constraints to enhance search performance and improve the interpretability of the alpha factors. The approach is motivated by, and mimics, the way alphas are searched for in practice, and aims to make that process more efficient. Analysis of 2020–2024 Chinese stock market data shows that our method yields superior out-of-sample prediction results and higher portfolio returns than the benchmark.
