Enhancing Two-Player Performance Through Single-Player Knowledge Transfer: An Empirical Study on Atari 2600 Games
Kimiya Saadat, Richard Zhao
TL;DR
The paper investigates improving two-player reinforcement learning by transferring knowledge from the corresponding single-player Atari game, using RAM observations to enable efficient training. Through an empirical study across ten Atari environments, it demonstrates that transferring prelearned representations and freezing early network layers yields at least comparable, and often superior, performance with a substantial reduction in training time. It also introduces a RAM-based complexity visualization and a simple predictor indicating that higher RAM complexity can correlate with greater transfer benefits in some games. These findings suggest a practical approach to stabilizing and accelerating multi-agent RL and provide a RAM-centric tool for anticipating transfer gains.
Abstract
Playing two-player games using reinforcement learning and self-play can be challenging due to the complexity of two-player environments and the possible instability of the training process. We propose that a reinforcement learning algorithm can train more efficiently and achieve improved performance in a two-player game if it leverages knowledge from the single-player version of the same game. This study examines the proposed idea in ten different Atari 2600 environments using the Atari 2600 RAM as the input state. We discuss the advantages of transfer learning from a single-player training process over training in a two-player setting from scratch, and report results on measures including training time and average total reward. We also discuss a method for calculating RAM complexity and its relationship to performance.
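As a rough illustration of the transfer idea summarized above, the following sketch copies the weights of a network trained on the single-player game into a two-player network and freezes its early layers, so that only the later layers are updated during two-player training. It is a minimal toy in plain Python and not the authors' implementation: the layer sizes, the number of actions, the helper names (`make_mlp`, `transfer`, `sgd_step`), and the placeholder gradients are all assumptions made for illustration. The only detail taken from the setting itself is the 128-byte Atari 2600 RAM used as the input state.

```python
import copy
import random

RAM_SIZE = 128  # the Atari 2600 exposes 128 bytes of RAM


def make_mlp(sizes, seed=0):
    """Build a toy MLP as a list of layers with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layers.append({
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)],
            "b": [0.0] * n_out,
            "trainable": True,
        })
    return layers


def transfer(single_player_net, n_frozen):
    """Copy all weights, then freeze the first n_frozen layers."""
    net = copy.deepcopy(single_player_net)
    for i, layer in enumerate(net):
        layer["trainable"] = i >= n_frozen
    return net


def sgd_step(net, grads, lr=0.01):
    """Apply one SGD update, skipping frozen layers."""
    for layer, g in zip(net, grads):
        if not layer["trainable"]:
            continue
        layer["W"] = [[w - lr * gw for w, gw in zip(row, g_row)]
                      for row, g_row in zip(layer["W"], g["W"])]
        layer["b"] = [b - lr * gb for b, gb in zip(layer["b"], g["b"])]


# Hypothetical sizes: RAM input, two hidden layers, 6 discrete actions.
single_player = make_mlp([RAM_SIZE, 64, 64, 6])
two_player = transfer(single_player, n_frozen=2)

# Placeholder gradients (all ones) stand in for a real RL loss gradient.
dummy_grads = [{"W": [[1.0] * len(row) for row in layer["W"]],
                "b": [1.0] * len(layer["b"])} for layer in two_player]
sgd_step(two_player, dummy_grads)

print([layer["trainable"] for layer in two_player])
```

In a real setup the frozen layers would typically be excluded from the optimizer of whichever deep learning framework is used; how many layers to freeze is a design choice that this sketch leaves as the `n_frozen` parameter.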
