FactorMiner: A Self-Evolving Agent with Skills and Experience Memory for Financial Alpha Discovery

Yanlong Wang; Jian Xu; Hongkang Zhang; Shao-Lun Huang; Danny Dongning Sun; Xiao-Ping Zhang

TL;DR

FactorMiner addresses the challenge of discovering interpretable, formulaic alpha factors in a vast search space by introducing a memory-augmented, modular agent framework. It combines a composable Factor Mining Skill with structured Experience Memory, enabling a Ralph Loop of retrieve, generate, evaluate, and distill to continually evolve the factor library from a global perspective. Across A-share and Crypto datasets, FactorMiner demonstrates competitive, low-redundancy factor libraries (110 factors) and robust cross-market performance, aided by GPU-accelerated evaluation to scale the search. The work offers a practical, interpretable path to scalable alpha discovery and provides a reproducible artifact for hypothesis-driven market microstructure analysis, with limitations and ethical considerations highlighted for responsible use.

Abstract

Formulaic alpha factor mining is a critical yet challenging task in quantitative investment, characterized by a vast search space and the need for domain-informed, interpretable signals. Moreover, finding novel signals becomes increasingly difficult as the factor library grows, because new candidates are increasingly redundant with factors already admitted. We propose FactorMiner, a lightweight and flexible self-evolving agent framework designed to navigate this complex landscape through continuous knowledge accumulation. FactorMiner combines a Modular Skill Architecture that encapsulates systematic financial evaluation into executable tools with a structured Experience Memory that distills historical mining trials into actionable insights (successful patterns and failure constraints). By instantiating the Ralph Loop paradigm -- retrieve, generate, evaluate, and distill -- FactorMiner iteratively uses memory priors to guide exploration, reducing redundant search while focusing on promising directions. Experiments on multiple datasets across different assets and markets show that FactorMiner constructs a diverse library of high-quality factors with competitive performance, while maintaining low redundancy among factors as the library scales. Overall, FactorMiner provides a practical approach to scalable discovery of interpretable formulaic alpha factors under the "Correlation Red Sea" constraint.
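
The retrieve, generate, evaluate, and distill cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `ralph_loop`, `candidate_pool`, and `memory`, the redundancy cap `CORR_MAX`, and the use of a random draw in place of an LLM agent are all assumptions made for illustration. Signals are flattened to one-dimensional vectors for brevity, whereas the paper works on a time-asset panel. Only the IC threshold of 0.02 is taken from the page (Figure 3 caption).

```python
import numpy as np

IC_MIN = 0.02    # IC threshold quoted in the Figure 3 caption
CORR_MAX = 0.5   # assumed redundancy cap; not taken from the paper


def spearman(a, b):
    """Spearman correlation as Pearson correlation of ranks (no tie handling)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])


def ralph_loop(candidate_pool, returns, rounds=3, batch=4, seed=0):
    """Toy loop: candidate_pool maps a hypothetical factor name to a 1-D signal."""
    rng = np.random.default_rng(seed)
    library = {}                                   # admitted factors: name -> signal
    memory = {"success": [], "forbidden": set()}   # toy stand-in for Experience Memory
    for _ in range(rounds):
        # Retrieve: exclude factors already admitted or marked as failures.
        open_names = [n for n in candidate_pool
                      if n not in library and n not in memory["forbidden"]]
        if not open_names:
            break
        # Generate: a random draw stands in for the agent's proposals.
        k = min(batch, len(open_names))
        picks = [str(n) for n in rng.choice(open_names, size=k, replace=False)]
        for name in picks:
            signal = candidate_pool[name]
            # Evaluate: IC screen, then correlation check against the library.
            ic = spearman(signal, returns)
            redundant = any(abs(spearman(signal, s)) > CORR_MAX
                            for s in library.values())
            # Distill: record the outcome so later rounds can use it.
            if abs(ic) > IC_MIN and not redundant:
                library[name] = signal
                memory["success"].append(name)
            else:
                memory["forbidden"].add(name)
    return library, memory
```

By construction, every admitted factor passes the IC screen and no two admitted factors exceed the assumed correlation cap, which is the low-redundancy property the abstract emphasizes.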

Paper Structure (52 sections, 11 equations, 10 figures, 10 tables, 1 algorithm)

Introduction
Related Work
Automated Alpha Factor Discovery
AI Agents with Skills and Memory
Methodology
Problem Formulation
Factor Mining Skill Architecture
Experience Memory
Ralph Loop: Self-Evolving Factor Discovery
Experiments
Experimental Setup
Main Results
Factor Quality and Diversity
Robustness Across Heterogeneous Markets
Ensembles vs. Learned Selection
...and 37 more sections

Figures (10)

Figure 1: FactorMiner System Architecture. The Ralph Loop framework integrates three key components: (1) Experience Memory that stores successful patterns and forbidden regions from past mining sessions; (2) Agent Skill that encapsulates the multi-stage validation pipeline (IC screening, correlation checking, deduplication, and full validation); (3) Factor Library that grows dynamically while maintaining orthogonality constraints. The agent iteratively retrieves memory priors, generates candidates through the skill, and distills outcomes back into memory for improved future exploration.
Figure 2: Pairwise Spearman correlation heatmap of the released A-share factor library (110 admitted factors), computed from cross-sectionally standardized realized factor signals over the common time--asset panel. The average off-diagonal absolute correlation is Avg $|\rho|$ = 0.203.
Figure 3: Ablation comparison between Have Memory and No Memory. High-quality candidates are defined as those passing the IC threshold ($|\text{IC}|>0.02$). The bar chart reports the counts (high-quality / rejected / admitted) and the corresponding yield and rejection rates.
Figure 4: Grouped bar chart of computation time on a log scale for operator-level and factor-level benchmarks. Lower is better; GPU shows consistent order-of-magnitude gains.
Figure 5: IC time-series analysis for three combination methods. All methods show stable positive IC throughout the evaluation period, with IC-weighted exhibiting slightly higher peaks.
...and 5 more figures
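
The redundancy statistic in the Figure 2 caption, the average off-diagonal absolute Spearman correlation, can be computed along the following lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the input is a plain `(n_obs, n_factors)` array of already standardized and flattened factor values, and ties are not handled. The paper's exact panel construction may differ.

```python
import numpy as np


def rank_columns(x):
    """Rank each column independently (0-based ranks, no tie handling)."""
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)


def avg_abs_offdiag_spearman(signals):
    """Mean |rho| over all off-diagonal pairs of factor columns.

    signals: array of shape (n_obs, n_factors); assumed layout, not the paper's.
    """
    # Spearman correlation is the Pearson correlation of the ranks.
    rho = np.corrcoef(rank_columns(signals), rowvar=False)
    n = rho.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)   # drop the trivial diagonal of ones
    return float(np.abs(rho[off_diagonal]).mean())
```

A value near 0 indicates nearly orthogonal factors and a value near 1 indicates a highly redundant library; the caption reports 0.203 for the 110 admitted A-share factors.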
