The Evolution of Reinforcement Learning in Quantitative Finance: A Survey

Nikolaos Pippas; Elliot A. Ludvig; Cagatay Turkay

TL;DR

This survey critically maps Reinforcement Learning (RL) methods onto Quantitative Finance (QF), covering four main RL families (value-based, policy-based, actor-critic, and model-based) and their relevance to portfolio management, hedging, execution, and market-making. It discusses how environments, actions, and rewards are constructed in finance, highlighting challenges such as non-stationarity, sample efficiency, and the simulation-to-real-world gap, while evaluating advances such as transfer learning, imitation learning, and multi-agent systems. The authors provide a structured critique of current practices, emphasize the importance of risk-aware rewards and interpretability, and propose concrete directions for future work, including multi-objective and hierarchical RL, richer feature sets, and robust evaluation protocols. Altogether, the work offers a practical, architecture-aware roadmap for advancing robust, transparent, and real-world-ready RL methods in financial decision-making.

Abstract

Reinforcement Learning (RL) has experienced significant advancement over the past decade, prompting a growing interest in applications within finance. This survey critically evaluates 167 publications, exploring diverse RL applications and frameworks in finance. Financial markets, marked by their complexity, multi-agent nature, information asymmetry, and inherent randomness, serve as an intriguing test-bed for RL. Traditional finance offers certain solutions, and RL advances these with a more dynamic approach, incorporating machine learning methods, including transfer learning, meta-learning, and multi-agent solutions. This survey dissects key RL components through the lens of Quantitative Finance. We uncover emerging themes, propose areas for future research, and critique the strengths and weaknesses of existing methods.

Paper Structure (62 sections, 9 equations, 3 figures, 4 tables)

Introduction
Reinforcement Learning and Finance
Temporal Dynamics in Financial Applications
Multi-Agent systems in Finance
Contribution and Paper Organisation
Critical Considerations for RL in QF
Transition from Simulation to Real-World Application
Sample Efficiency
Online vs. Offline RL Settings
On-Policy vs. Off-Policy Frameworks
Main Reinforcement Learning Methods
Value-Based Methods
Core Framework for Value-based RL in QF.
General Observations, Comments and Definitions for Value-based Methods.
Policy-Based Method
...and 47 more sections

Figures (3)

Figure 1: The general schematic of agent-environment interaction under the RL framework (Sutton and Barto, 2018).
Figure 2: The overview specifies how an agent and the environment interact in the QF domain using the classical RL framework depicted in Figure 1. With this, we map concepts, techniques and practices from QF that are identified in the survey to the components of the RL framework. (Note: Profit-based rewards can include financial gains such as dividends and pay-offs.)
Figure 3: Timeline of important publications in QF under the RL-based framework.
