Self Paced Gaussian Contextual Reinforcement Learning

Mohsen Sahraei Ardakani; Rui Song

Abstract

Curriculum learning improves reinforcement learning (RL) efficiency by sequencing tasks from simple to complex. However, many self-paced curriculum methods rely on computationally expensive inner-loop optimizations, limiting their scalability in high-dimensional context spaces. In this paper, we propose Self-Paced Gaussian Curriculum Learning (SPGL), a novel approach that avoids costly numerical procedures by leveraging a closed-form update rule for Gaussian context distributions. SPGL maintains the sample efficiency and adaptability of traditional self-paced methods while substantially reducing computational overhead. We provide theoretical guarantees on convergence and validate our method across several contextual RL benchmarks, including the Point Mass, Lunar Lander, and Ball Catching environments. Experimental results show that SPGL matches or outperforms existing curriculum methods, especially in hidden context scenarios, and achieves more stable context distribution convergence. Our method offers a scalable, principled alternative for curriculum generation in challenging continuous and partially observable domains.

Paper Structure (16 sections, 8 equations, 5 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Self Paced Gaussian Curriculum Learning
Performance Optimization Problem
Context Distribution Convergence Problem
Algorithmic Realization
Experiments
Point Mass Environment
Lunar Lander Environment
Ball Catching Environment
Conclusion
Lemma 1. Proof
Lemma 2. Proof
Corollary to Lemma 2. Proof
...and 1 more section

Figures (5)

Figure 1: Point Mass environment with an initial context sample and a target context sample
Figure 2: Point Mass setup 1 experiment with hidden context curriculum learning and training
Figure 3: Point Mass setup 2 experiment with visible context curriculum learning and training
Figure 4: Lunar Lander experiment with hidden context curriculum learning and training
Figure 5: Ball Catching experiment with hidden context curriculum learning and training
