Improving Generalization in Game Agents with Data Augmentation in Imitation Learning

Derek Yadgaroff, Alessandro Sestini, Konrad Tollmar, Ayca Ozcelikkale, Linus Gisslén

TL;DR

Imitation-learning agents in games often generalize poorly to unseen scenarios due to distribution shift between demonstrations and deployment. The authors propose data augmentation on feature-based state spaces within Behavioral Cloning to diversify the training distribution without changing actions, and they benchmark across four distinct 3D test environments. They conduct a large study of 38 augmentation combinations (up to 3 augmentations per example) across 228 models and demonstrate consistent generalization improvements, with some configurations achieving up to $1.6$ times the baseline performance. The work provides practical guidance for practitioners and suggests directions for broader evaluation across tasks and environments.

Abstract

Imitation learning is an effective approach for training game-playing agents and, consequently, for efficient game production. However, generalization - the ability to perform well in related but unseen scenarios - is an essential requirement that remains an unsolved challenge for game AI. Generalization is difficult for imitation learning agents because it requires the algorithm to take meaningful actions outside of the training distribution. In this paper we propose a solution to this challenge. Inspired by the success of data augmentation in supervised learning, we augment the training data so the distribution of states and actions in the dataset better represents the real state-action distribution. This study evaluates methods for combining and applying data augmentations to observations, to improve generalization of imitation learning agents. It also provides a performance benchmark of these augmentations across several 3D environments. These results demonstrate that data augmentation is a promising framework for improving generalization in imitation learning agents.

Paper Structure (15 sections, 1 equation, 5 figures, 3 tables)

Figures (5)

  • Figure 1: (a) The original environment where human demonstrations were created and used for training models. (b) The player/agent must navigate to the building, press the button to open the door, and enter the goal state before the door closes. (c) A modified version of the original environment used for testing the model's performance and ability to generalize in a new environment.
  • Figure 2: Overview of the augmentation process. We start with the original set of demonstrations. Then, depending on the number and type of the selected augmentations, we create a new, unique dataset. For example, if we select two augmentations (Gaussian noise and semantic dropout), we start from the original dataset, apply Gaussian noise to it, and then augment the resulting dataset with semantic dropout.
  • Figure 3: Relative performance of the top $20$ models, averaged over all experiments (i.e., the $4$ test environments) with standard error. Abbreviations: gauss (Gaussian noise), uni (uniform noise), sca (scaling), sm (state-mixup), drc (continuous dropout), drs (semantic dropout), rX (X percent of the original dataset used), and eX ($\sigma = 3 \cdot 10^{-\text{X}}$). Additional details about the augmentations are provided in Section \ref{sec:method}.
  • Figure 4: Models that consistently outperform the baseline in the $4$ different testing environments plus the training environment. For the abbreviations legend, see Figure 3.
  • Figure 5: Relative performance, averaged over all environments, is consistently poor or failing for the bottom group of models. Abbreviations: gauss (Gaussian noise), uni (uniform noise), sca (scaling), sm (state-mixup), drc (continuous dropout), drs (semantic dropout), rX (X percent of the original dataset used), and eX ($\sigma = 3 \cdot 10^{-\text{X}}$).
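
The sequential augmentation process described in the Figure 2 caption (Gaussian noise first, then semantic dropout on the result) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation. The names `gaussian_noise`, `semantic_dropout`, and `augment` are hypothetical, as are the feature grouping, the zero fill value, and the dropout probability `p`. The noise scale follows the eX abbreviation ($\sigma = 3 \cdot 10^{-\text{X}}$). Whether augmented samples replace the originals or are appended to them is not stated here, so the sketch simply returns a transformed copy and leaves every action unchanged.

```python
import random
from typing import Callable, List, Sequence, Tuple

# One demonstration sample: (feature-based state vector, action).
# Only the state is ever augmented; the action is passed through untouched.
Sample = Tuple[List[float], int]
Augmentation = Callable[[List[Sample], random.Random], List[Sample]]


def gaussian_noise(exponent: int) -> Augmentation:
    """Add zero-mean Gaussian noise with sigma = 3 * 10**(-exponent) to each feature."""
    sigma = 3.0 * 10.0 ** (-exponent)

    def apply(dataset: List[Sample], rng: random.Random) -> List[Sample]:
        return [([x + rng.gauss(0.0, sigma) for x in state], action)
                for state, action in dataset]

    return apply


def semantic_dropout(groups: Sequence[Sequence[int]], p: float) -> Augmentation:
    """Zero out whole groups of related features, each group with probability p.

    ASSUMPTION: the grouping of feature indices and the zero fill value are
    made up for this sketch; the paper's exact definition may differ.
    """

    def apply(dataset: List[Sample], rng: random.Random) -> List[Sample]:
        out: List[Sample] = []
        for state, action in dataset:
            new_state = list(state)  # copy so the input dataset is not mutated
            for group in groups:
                if rng.random() < p:
                    for i in group:
                        new_state[i] = 0.0
            out.append((new_state, action))
        return out

    return apply


def augment(dataset: List[Sample],
            augmentations: Sequence[Augmentation],
            seed: int = 0) -> List[Sample]:
    """Apply the selected augmentations one after another, in order."""
    rng = random.Random(seed)
    current = list(dataset)
    for aug in augmentations:
        current = aug(current, rng)
    return current


# Usage: two toy demonstrations with 4 features each (values are made up).
demos = [([0.5, -0.2, 1.0, 0.0], 2), ([0.1, 0.3, -0.7, 0.9], 0)]
pipeline = [gaussian_noise(exponent=3), semantic_dropout(groups=[[2, 3]], p=0.5)]
augmented = augment(demos, pipeline)
```

Representing each augmentation as a function from dataset to dataset keeps the order of application explicit, which matters because noise applied after dropout would perturb the zeroed features, while noise applied before dropout would not.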