Exploring Parity Challenges in Reinforcement Learning through Curriculum Learning with Noisy Labels

Bei Zhou; Soren Riis

TL;DR

The paper frames parity learning in impartial games as a bottleneck for reinforcement learning with self-play, focusing on how label noise interacts with curriculum-like exposure when game states are represented as bitstrings. It introduces a latent-curriculum approach with controlled label noise, studies bitstrings up to length $n=100$, and trains a single-layer LSTM with binary cross-entropy loss to quantify learning dynamics. The analysis includes a gradient-information bound, $\mathrm{Var}(\mathcal{H}, F, \mathbf{w}) \le \frac{C}{2^n}$, which holds under uniform data. The key findings are that learning deteriorates as bitstring length grows and that more than 5% label noise on long bitstrings prevents the model from learning parity; latent curricula mitigate some of these difficulties but require substantial training. Overall, the work highlights practical implications for improving self-play RL in impartial games and provides a framework for evaluating resilience to noisy labels.

Abstract

This paper examines the application of reinforcement learning (RL) to strategy games, particularly those characterized by parity challenges, as seen in specific positions of Go and Chess and in a broader range of impartial games. We propose a simulated learning process, structured within a curriculum learning framework and augmented with noisy labels, to mirror the intricacies of self-play learning scenarios. This approach allows a thorough analysis of how neural networks (NNs) adapt and evolve from elementary to increasingly complex game positions. Our empirical research indicates that even minimal label noise can significantly impede NNs' ability to discern effective strategies, a difficulty that intensifies with the growing complexity of the game positions. These findings underscore the urgent need for advanced methodologies in RL training, specifically tailored to counter the obstacles imposed by noisy evaluations. The development of such methodologies is crucial not only for enhancing NN proficiency in strategy games with significant parity elements but also for broadening the resilience and efficiency of RL systems across diverse and complex environments.
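
The noisy-label setup described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the label convention (1 for an odd number of ones), the uniform sampling of bitstrings, and the independent per-sample label flip are all assumptions made for illustration.

```python
import random


def parity(bits):
    """Return 1 if `bits` contains an odd number of ones, else 0 (assumed convention)."""
    return sum(bits) % 2


def make_noisy_parity_dataset(num_samples, n, noise_rate, seed=0):
    """Sample uniform bitstrings of length `n` and flip each label with probability `noise_rate`.

    Hypothetical helper: returns a list of (bits, label, flipped) triples, where
    `flipped` records whether the stored label disagrees with the true parity.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        bits = [rng.randint(0, 1) for _ in range(n)]
        label = parity(bits)
        flipped = rng.random() < noise_rate
        if flipped:
            label = 1 - label  # inject label noise
        data.append((bits, label, flipped))
    return data


# Example: 1000 bitstrings of length 20 with a 5% target noise rate.
dataset = make_noisy_parity_dataset(num_samples=1000, n=20, noise_rate=0.05)
observed_noise = sum(flipped for _, _, flipped in dataset) / len(dataset)
```

Here `observed_noise` is the realized fraction of flipped labels, which will be close to but generally not exactly equal to the target `noise_rate`. A dataset like this could then be fed to a sequence model such as the single-layer LSTM mentioned in the TL;DR.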

Paper Structure (9 sections, 6 equations, 2 figures)

This paper contains 9 sections, 6 equations, 2 figures.

Introduction
Related Work
Learning parity function with uniform data
Learning parity function with non-uniform data
Learning parity function with various NN architectures
Learning parity function with noisy labels
Learning parity function from latent curriculum
Learning parity function from latent curriculum with noisy labels
Conclusion

Figures (2)

Figure 1: Under the dataset generated by the latent curriculum, 10 experiments with 10 random seeds were conducted for each given bitstring length.
Figure 2: The maximum percentage of the noisy labels in the dataset on which the LSTM model can model the parity function with more than 95% accuracy across bitstrings whose length ranges from 20 to 100.
