Profit Mirage: Revisiting Information Leakage in LLM-based Financial Agents
Xiangyu Li, Yawen Zeng, Xiaofen Xing, Jin Xu, Xiangmin Xu
TL;DR
This work identifies a pervasive information leakage problem in LLM-based financial agents: backtested profits vanish once the model's knowledge window closes. It introduces FinLake-Bench, a leakage-robust benchmark, and FactFin, a counterfactual evolution framework that treats LLMs as strategy generators rather than decision-makers, combining a Strategy Code Generator, Retrieval-Augmented Generation, Monte Carlo Tree Search, and a Counterfactual Simulator. In experiments across six assets and multiple LLM backbones, FactFin outperforms the baselines on out-of-sample risk-adjusted returns and substantially mitigates information leakage. The findings underscore the importance of counterfactual reasoning and strategy-driven design for leakage-resilient financial forecasting with LLMs, with practical implications for deploying such models in real markets.
Abstract
LLM-based financial agents have attracted widespread excitement for their ability to trade like human experts. However, most systems exhibit a "profit mirage": dazzling back-tested returns evaporate once the model's knowledge window ends, because of the inherent information leakage in LLMs. In this paper, we systematically quantify this leakage issue across four dimensions and release FinLake-Bench, a leakage-robust evaluation benchmark. Furthermore, to mitigate this issue, we introduce FactFin, a framework that applies counterfactual perturbations to compel LLM-based agents to learn causal drivers instead of memorized outcomes. FactFin integrates four core components: Strategy Code Generator, Retrieval-Augmented Generation, Monte Carlo Tree Search, and Counterfactual Simulator. Extensive experiments show that our method surpasses all baselines in out-of-sample generalization, delivering superior risk-adjusted performance.
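To make the "profit mirage" concrete, the following is a minimal sketch of how one might compare risk-adjusted performance inside and outside a model's knowledge window. It is not the FinLake-Bench methodology or any part of FactFin; the function names (`sharpe_ratio`, `split_by_cutoff`), the cutoff date, the zero risk-free rate, and the toy returns are all assumptions chosen for illustration.

```python
from datetime import date
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns.

    Assumes a risk-free rate of 0 and daily data (252 periods per year).
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    volatility = stdev(returns)
    if volatility == 0:
        raise ValueError("zero volatility; Sharpe ratio is undefined")
    return mean(returns) / volatility * sqrt(periods_per_year)


def split_by_cutoff(dated_returns, cutoff):
    """Split (date, return) pairs into in-window and out-of-window returns.

    `cutoff` is a hypothetical knowledge-cutoff date for the LLM backbone.
    """
    in_window = [r for d, r in dated_returns if d <= cutoff]
    out_of_window = [r for d, r in dated_returns if d > cutoff]
    return in_window, out_of_window


# Toy daily strategy returns (made up for illustration, not real results).
dated_returns = [
    (date(2023, 1, 2), 0.012),
    (date(2023, 1, 3), 0.008),
    (date(2023, 1, 4), 0.015),
    (date(2023, 1, 5), 0.010),
    (date(2024, 1, 2), 0.004),
    (date(2024, 1, 3), -0.009),
    (date(2024, 1, 4), 0.002),
    (date(2024, 1, 5), -0.006),
]
cutoff = date(2023, 12, 31)  # assumed knowledge cutoff

in_window, out_of_window = split_by_cutoff(dated_returns, cutoff)
sharpe_in = sharpe_ratio(in_window)
sharpe_out = sharpe_ratio(out_of_window)
gap = sharpe_in - sharpe_out

print(f"in-window Sharpe:     {sharpe_in:.2f}")
print(f"out-of-window Sharpe: {sharpe_out:.2f}")
print(f"gap (in minus out):   {gap:.2f}")
```

A large positive gap on real data would be consistent with, though not proof of, the leakage effect described above: performance that depends on memorized outcomes should degrade once the evaluation period moves past what the model has seen.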
