Decision Transformer vs. Decision Mamba: Analysing the Complexity of Sequential Decision Making in Atari Games

Ke Yan

TL;DR

This study systematically compares Decision Transformer (DT) and Decision Mamba (DM) on a broad Atari suite to uncover when sequence-modeling RL methods excel. It reveals that action-space complexity and visual complexity jointly drive the performance gap, with DM favoured in simpler environments and DT in more complex ones. Using random forest regression, SHAP values, and correlation analyses, the work highlights action count and compression ratio as primary determinants, and demonstrates that simplifying the action space via fusion shifts relative performance, underscoring the importance of environment characteristics for model design. The findings offer guidance for designing robust sequence-modeling RL systems and motivate exploring hybrids that leverage both architectures across diverse, complex tasks.

Abstract

This work analyses the disparity in performance between Decision Transformer (DT) and Decision Mamba (DM) in sequence modelling reinforcement learning tasks across different Atari games. We first observed that DM generally outperformed DT in Breakout and Qbert, while DT performed better in more complicated games such as Hero and Kung Fu Master. To understand these differences, we expanded the set of games to 12 and analysed game characteristics including action space complexity, visual complexity, average trajectory length, and average number of steps to the first non-zero reward. To identify the key factors behind the performance disparity, we employed several approaches: quantifying visual complexity, random forest regression, correlation analysis, and action space simplification strategies. The results indicate that the performance gap between DT and DM arises from an interaction of multiple factors, with action space complexity and visual complexity (particularly as measured by compression ratio) being the primary determinants. DM performs well in environments with simple action spaces and visual elements, while DT shows an advantage in games with higher action and visual complexity. Our findings contribute to a deeper understanding of how game characteristics affect the performance difference in sequence modelling reinforcement learning, and may guide future model design and applications for diverse and complex environments.
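To make the compression-ratio notion of visual complexity concrete, the following is a minimal sketch and not the paper's actual procedure. It assumes frames are available as raw bytes, uses `zlib` as a stand-in compressor, and defines the ratio as compressed size divided by raw size (so a higher value means a visually busier frame). The function name `compression_ratio`, the 84x84 frame size, and the synthetic frames are all illustrative assumptions.

```python
import random
import zlib


def compression_ratio(frame_bytes: bytes, level: int = 6) -> float:
    """Compressed size / raw size for one frame.

    Assumed convention: values near 0 mean highly compressible (visually
    simple); values near 1 mean nearly incompressible (visually complex).
    """
    if not frame_bytes:
        raise ValueError("frame_bytes must be non-empty")
    return len(zlib.compress(frame_bytes, level)) / len(frame_bytes)


# Hypothetical 84x84 greyscale frames, flattened to raw bytes.
flat_frame = bytes([0] * (84 * 84))  # uniform background, very simple
rng = random.Random(0)  # fixed seed so the example is reproducible
noisy_frame = bytes(rng.randrange(256) for _ in range(84 * 84))  # dense detail

flat_ratio = compression_ratio(flat_frame)
noisy_ratio = compression_ratio(noisy_frame)

# The uniform frame compresses far better than the noisy one.
print(f"flat: {flat_ratio:.4f}, noisy: {noisy_ratio:.4f}")
```

Averaging such a ratio over frames sampled from a game's trajectories would give one per-game score that could be fed, alongside action count and trajectory statistics, into a regression or correlation analysis of the kind described above.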

Paper Structure

This paper contains 40 sections, 8 equations, 11 figures, 16 tables, 1 algorithm.

Figures (11)

  • Figure 1: Mamba block, $\sigma$ is SiLU function, and $\otimes$ stands for elementwise multiplication.
  • Figure 2: The Architecture of Decision Transformer (Left) and Decision Mamba (Right). $N$ represents normalization layers, activation function $\sigma$ stands for GELU (Gaussian Error Linear Unit), and $+$ are addition operations used for skip connections.
  • Figure 3: Feature importance of each game metric. Higher scores indicate a greater influence on the performance difference between the models.
  • Figure 4: SHAP value feature importance. Higher values suggest a greater influence.
  • Figure 5: Correlation Matrix of Performance Difference and the Game Metrics. The colour intensity represents the strength of the correlation, with red indicating positive correlations and blue indicating negative correlations.
  • ...and 6 more figures