Table of Contents
Fetching ...

AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph biased evolution

Taian Guo, Haiyang Shen, Junyu Luo, Binqi Chen, Hongjun Ding, Jinsheng Huang, Luchen Liu, Yun Ma, Ming Zhang

TL;DR

This work introduces AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG).

Abstract

Extracting signals through alpha factor mining is a fundamental challenge in quantitative finance. Existing automated methods primarily follow two paradigms: Decoupled Factor Generation, which treats factor discovery as isolated events, and Iterative Factor Evolution, which focuses on local parent-child refinements. However, both paradigms lack a global structural view, often treating factor pools as unstructured collections or fragmented chains, which leads to redundant search and limited diversity. To address these limitations, we introduce AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG). By modeling factors as nodes and evolutionary links as edges, AlphaPROBE treats the factor pool as a dynamic, interconnected ecosystem. The framework consists of two core components: a Bayesian Factor Retriever that identifies high-potential seeds by balancing exploitation and exploration through a posterior probability model, and a DAG-aware Factor Generator that leverages the full ancestral trace of factors to produce context-aware, nonredundant optimizations. Extensive experiments on three major Chinese stock market datasets against 8 competitive baselines demonstrate that AlphaPROBE significantly gains enhanced performance in predictive accuracy, return stability and training efficiency. Our results confirm that leveraging global evolutionary topology is essential for efficient and robust automated alpha discovery. We have open-sourced our implementation at https://github.com/gta0804/AlphaPROBE.

AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph biased evolution

TL;DR

This work introduces AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG).

Abstract

Extracting signals through alpha factor mining is a fundamental challenge in quantitative finance. Existing automated methods primarily follow two paradigms: Decoupled Factor Generation, which treats factor discovery as isolated events, and Iterative Factor Evolution, which focuses on local parent-child refinements. However, both paradigms lack a global structural view, often treating factor pools as unstructured collections or fragmented chains, which leads to redundant search and limited diversity. To address these limitations, we introduce AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG). By modeling factors as nodes and evolutionary links as edges, AlphaPROBE treats the factor pool as a dynamic, interconnected ecosystem. The framework consists of two core components: a Bayesian Factor Retriever that identifies high-potential seeds by balancing exploitation and exploration through a posterior probability model, and a DAG-aware Factor Generator that leverages the full ancestral trace of factors to produce context-aware, nonredundant optimizations. Extensive experiments on three major Chinese stock market datasets against 8 competitive baselines demonstrate that AlphaPROBE significantly gains enhanced performance in predictive accuracy, return stability and training efficiency. Our results confirm that leveraging global evolutionary topology is essential for efficient and robust automated alpha discovery. We have open-sourced our implementation at https://github.com/gta0804/AlphaPROBE.
Paper Structure (42 sections, 19 equations, 8 figures, 3 tables)

This paper contains 42 sections, 19 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: An example of alpha factor measuring the normalized daily price change on the Abstract Syntax Tree(AST) and expression view. All available features and operators are in the Appendix \ref{['appendix::detail:featop']}.
  • Figure 2: The overall illustration of AlphaPROBE. AlphaPROBE is a closed-loop framework consisting of a Bayesian Factor Retriever and a DAG-aware Factor Generator. The Bayesian Factor Retriever selects factors with higher potential by a trade-off among factors' intrinsic quality & diversity, topological structure, and lineage. The DAG-aware Factor Generator utilizes a multi-agent structure and factors' topology to generate new factors.
  • Figure 3: Backtest curve on the CSI 300.
  • Figure 4: Sensitivity study of AlphaPROBE.
  • Figure 5: IC dynamics of AlphaPROBE and two LLM-based methods on the CSI 300 test set over training iterations. For fair comparison, one iteration here means the backbone LLM generate a new factor to be evaluated.
  • ...and 3 more figures