Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning

William A. Stigall

TL;DR

This research benchmarks DCQNs and DTQNs on the Atari games Asteroids, Space Invaders, and Centipede to address the limited study of Transformer-based DQNs, and finds that in the 35-40 million parameter range the DCQN outperforms the DTQN in speed across both the ViT and Projection architectures.

Abstract

In this study, we investigate the performance of Deep Q-Networks (DQNs) built on Convolutional Neural Network (CNN) and Transformer architectures across three Atari games. The advent of DQNs has significantly advanced Reinforcement Learning, enabling agents to learn optimal policies directly from high-dimensional sensory inputs such as pixel or RAM data. While CNN-based DQNs have been extensively studied and deployed in various domains, Transformer-based DQNs remain relatively unexplored. Our research aims to fill this gap by benchmarking Deep Convolutional Q-Networks (DCQNs) and Deep Transformer Q-Networks (DTQNs) on the Atari games Asteroids, Space Invaders, and Centipede. We find that in the 35-40 million parameter range, the DCQN outperforms the DTQN in speed across both the ViT and Projection architectures. We also find that the DCQN outperforms the DTQN in all games except Centipede.

Paper Structure

This paper contains 54 sections, 29 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Deep Q-Network architecture, emulating the traditional Q-Network defined in Mnih et al. (2013).
  • Figure 2: ViT-inspired Deep Transformer Q-Network architecture, which uses 16x16 patches as the input for the GTrXL.
  • Figure 3: Linear Projection Deep Transformer Q-Network architecture, which increases computational efficiency by projecting features to the embedding dimension directly.
  • Figure 4: Centipede reward over 10k episodes. DTQN is depicted in green and DCQN in purple.
  • Figure 5: Centipede loss over 10k episodes. DTQN is depicted in green and DCQN in purple.
  • ...and 6 more figures
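
The Figure 2 caption mentions feeding 16x16 patches to the transformer. As a rough illustration of what that patching step might look like, here is a minimal pure-Python sketch. The function name `split_into_patches` and the 64x64 frame size are assumptions made for the example; the input resolution and the preprocessing actually used in the paper are not stated in this summary.

```python
def split_into_patches(frame, patch=16):
    """Split a 2-D frame (a list of rows) into non-overlapping patch x patch tiles.

    Returns a list of flattened patches in row-major order.
    This is an illustrative helper, not the paper's implementation.
    """
    height, width = len(frame), len(frame[0])
    if height % patch or width % patch:
        raise ValueError("frame dimensions must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one tile row by row into a single vector.
            tile = [frame[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            patches.append(tile)
    return patches


# Hypothetical 64x64 grayscale frame; the real input size is an assumption here.
frame = [[(r * 64 + c) % 256 for c in range(64)] for r in range(64)]
patches = split_into_patches(frame)
print(len(patches), len(patches[0]))
```

Under these assumptions, a 64x64 frame yields 16 patches of 256 values each. In a ViT-style model, each flattened patch would then be mapped to the embedding dimension before entering the transformer.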