An Introduction to Deep Reinforcement Learning
Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
TL;DR
This paper surveys deep reinforcement learning, framing it as the combination of reinforcement learning with deep neural networks to solve high-dimensional sequential decision problems. It categorizes approaches into value-based (e.g., DQN and its variants), policy-gradient (including actor-critic and natural/trust-region methods), and model-based methods, and discusses how each scales with neural function approximators. It emphasizes generalization as a core challenge, detailing feature selection, auxiliary tasks, objective shaping, and hierarchical learning as mechanisms to balance bias and overfitting across offline and online settings. The survey also covers benchmarking practices, exploration strategies, and settings beyond the standard MDP (POMDPs, transfer/continual learning, learning from demonstrations, and multi-agent systems) that broaden the applicability of deep RL, and it touches on safety, reliability, and societal impact. Overall, it identifies combining model-based and model-free approaches, improving sample efficiency, and pursuing meta-learning and curriculum strategies as key directions for advancing deep RL toward robust, real-world deployment, while noting the need for careful evaluation and ethical considerations.
Abstract
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, and finance. This manuscript provides an introduction to deep reinforcement learning models, algorithms, and techniques. It focuses in particular on generalization and on how deep RL can be used in practical applications. We assume the reader is familiar with basic machine learning concepts.
