The Evolution of Reinforcement Learning in Quantitative Finance: A Survey

Nikolaos Pippas, Elliot A. Ludvig, Cagatay Turkay

TL;DR

This survey critically maps Reinforcement Learning (RL) methods onto Quantitative Finance, detailing four main RL families (Value-based, Policy-based, Actor-Critic, Model-based) and their relevance to portfolio management, hedging, execution, and market-making. It discusses how environments, actions, and rewards are constructed in finance, highlighting challenges such as non-stationarity, sample efficiency, and the simulation-to-real-world gap, while evaluating advances like transfer learning, imitation learning, and multi-agent systems. The authors provide a structured critique of current practices, emphasize the importance of risk-aware rewards and interpretability, and propose concrete directions for future work, including multi-objective and hierarchical RL, richer feature sets, and robust evaluation protocols. Altogether, the work offers a practical, architecture-aware roadmap for advancing robust, transparent, and real-world-ready RL methods in financial decision-making.

Abstract

Reinforcement Learning (RL) has experienced significant advancement over the past decade, prompting a growing interest in applications within finance. This survey critically evaluates 167 publications, exploring diverse RL applications and frameworks in finance. Financial markets, marked by their complexity, multi-agent nature, information asymmetry, and inherent randomness, serve as an intriguing test-bed for RL. Traditional finance offers certain solutions, and RL advances these with a more dynamic approach, incorporating machine learning methods, including transfer learning, meta-learning, and multi-agent solutions. This survey dissects key RL components through the lens of Quantitative Finance. We uncover emerging themes, propose areas for future research, and critique the strengths and weaknesses of existing methods.

Paper Structure (62 sections, 9 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: The general schematic of agent-environment interaction under the RL framework (Sutton and Barto, 2018).
  • Figure 2: The overview specifies how an agent and the environment interact in the QF domain using the classical RL framework depicted in Figure 1. With this, we map concepts, techniques, and practices from QF that are identified in the survey to the components of the RL framework. (Note: Profit-based rewards can include financial gains such as dividends and pay-offs.)
  • Figure 3: Timeline of important publications in QF under the RL-based framework.