Introduction to Reinforcement Learning

Majid Ghasemi; Dariush Ebrahimi

TL;DR

This paper provides a structured, beginner-friendly overview of reinforcement learning, outlining its core components (states, actions, policies, rewards) and the MDP formalism, including the discount factor $\gamma$ and return $G_t$. It surveys model-free and model-based methods across value-based, policy-based, and hybrid approaches, and uses the Bellman equations to explain policy evaluation, policy improvement, and policy iteration. It highlights essential algorithms such as Q-learning, SARSA, DQN, PPO, and actor-critic methods, and discusses on-policy versus off-policy learning and deep RL extensions. It also offers curated resources (books, courses, and communities) to support practical learning and implementation. Overall, the paper serves as a foundational gateway for newcomers to understand RL theory and start applying RL techniques.

Abstract

Reinforcement Learning (RL), a subfield of Artificial Intelligence (AI), focuses on training agents to make decisions by interacting with their environment to maximize cumulative rewards. This paper provides an overview of RL, covering its core concepts, methodologies, and resources for further learning. It explains fundamental components such as states, actions, policies, and reward signals in detail, so that readers develop a solid foundational understanding. Additionally, the paper presents a variety of RL algorithms, categorized by key factors such as whether they are model-free or model-based and value-based or policy-based. Resources for learning and implementing RL, such as books, courses, and online communities, are also provided. By offering a clear, structured introduction, this paper aims to simplify the complexities of RL for beginners, providing a straightforward pathway to understanding.

Paper Structure

This paper contains 20 sections, 43 equations, 1 figure, 2 tables, 1 algorithm.

Introduction
Backgrounds & Key Concepts
Multi-Armed Bandit(s)
Markov Decision Process (MDPs)
Policies and Value Functions
Optimal Policies and Optimal Value Functions
Policy Evaluation (Prediction)
Policy Improvement
Policy Improvement Theorem
Policy Iteration
Value Iteration
Core RL Methods
Model-free & Model-based methods
Off-Policy and On-Policy Methods
Essential Algorithms
...and 5 more sections

Figures (1)

Figure 1: Overview of Reinforcement Learning (Sutton and Barto, 2018)
