CALE: Continuous Arcade Learning Environment

Jesse Farebrother, Pablo Samuel Castro

TL;DR

The Continuous Arcade Learning Environment (CALE) extends the well-known Arcade Learning Environment (ALE) with support for continuous actions, so that continuous-control agents and value-based agents can be benchmarked and evaluated on the same environment suite.

Abstract

We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds support for continuous actions. This enables the benchmarking and evaluation of continuous-control agents (such as PPO [Schulman et al., 2017] and SAC [Haarnoja et al., 2018]) and value-based agents (such as DQN [Mnih et al., 2015] and Rainbow [Hessel et al., 2018]) on the same environment suite. We provide a series of open questions and research directions that CALE enables, as well as initial baseline results using Soft Actor-Critic. CALE is available as part of the ALE at https://github.com/Farama-Foundation/Arcade-Learning-Environment.

Paper Structure

This paper contains 26 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: Left panel: Atari CX10 controller. Right panel: Discrete joystick positions (top left) versus continuous joystick positions with varying values of the threshold $\tau$. The black circle corresponds to a joystick at position $(r,\theta)=(0.61, 2.53)$.
  • Figure 2: CALE comparison with varying $\tau$ on the 100k (left) and 200M (right) training regimes.
  • Figure 3: CALE comparison of $\phi_{SAC}$ and $\phi_{DQN}$ on the 100k (left) and 200M (right) training regimes.
  • Figure 4: CALE comparison of default SAC exploration with the more common $\epsilon$-greedy exploration used in discrete action agents on the 100k (left) and 200M (right) training regimes.
  • Figure 5: CALE comparison of SAC with DQN (using the default Dopamine implementation [Castro et al., 2018]) on a selection of games. Returns averaged over 5 independent runs, with shaded areas representing 95% confidence intervals.
  • ...and 5 more figures
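
The Figure 1 caption describes a continuous joystick position $(r,\theta)$ interpreted under a threshold $\tau$. The following is a minimal sketch of one plausible reading of that mapping, not CALE's actual implementation: it assumes the polar position is converted to Cartesian coordinates and that a direction fires when the corresponding coordinate exceeds $\tau$ in magnitude. The function name `joystick_events` and the direction labels are hypothetical and chosen only for illustration.

```python
import math


def joystick_events(r, theta, tau):
    """Map a polar joystick position to a set of active directions.

    Assumption (not confirmed by this section): a direction is active
    when the matching Cartesian coordinate exceeds the threshold tau.
    """
    # Convert the polar position (r, theta) to Cartesian coordinates.
    x = r * math.cos(theta)
    y = r * math.sin(theta)

    events = set()
    # Horizontal axis: at most one of LEFT / RIGHT can be active.
    if x > tau:
        events.add("RIGHT")
    elif x < -tau:
        events.add("LEFT")
    # Vertical axis: at most one of UP / DOWN can be active.
    if y > tau:
        events.add("UP")
    elif y < -tau:
        events.add("DOWN")
    return events


# The joystick position highlighted in Figure 1.
position = (0.61, 2.53)
for tau in (0.25, 0.4, 0.9):
    print(tau, sorted(joystick_events(*position, tau)))
```

Under this assumed rule, the position from Figure 1 activates both LEFT and UP at $\tau = 0.25$, only LEFT at $\tau = 0.4$, and no direction at $\tau = 0.9$, which illustrates how larger thresholds shrink the region of joystick positions that trigger any input.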