Robust Imitation Learning for Automated Game Testing
Pierluigi Vito Amadori, Timothy Bradley, Ryan Spick, Guy Moss
TL;DR
EVOLUTE targets the high cost of human playtesting with a two-stream imitation learning framework that splits the agent's action space into discrete and continuous components: FF-BC handles discrete actions and EnergyControlled-BC (an energy-based BC) handles continuous actions. The discrete stream is trained with standard BCE-based behavioural cloning. The continuous stream learns an energy function $E_{\theta}(\mathbf{s},\mathbf{a}_c)$ with Knowledge-Contrastive (InfoNCE) objectives and uses No-Grad/Grid-Search inference to select low-energy actions. In a shooting-and-driving game (Hardware Rivals), EVOLUTE generalises and explores better than a pure BC baseline, achieving more kills and longer survival, and it remains effective with limited data and even without depth information. These results suggest that the approach can reduce reliance on exhaustive human playtesting in game quality assurance.
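To make the continuous stream concrete, the following is a minimal sketch of an InfoNCE-style loss and a gradient-free grid-search inference step over an energy function $E_{\theta}(\mathbf{s},\mathbf{a}_c)$. It is not the authors' implementation: the quadratic toy `energy`, the parameter matrix `theta`, the action bounds and the grid resolution are all assumptions made for illustration, and "No-Grad/Grid-Search" is read here as a plain argmin over a regular grid.

```python
import numpy as np

def energy(state, action, theta):
    # Hypothetical toy energy (assumption): squared distance between the
    # action and a linear function of the state. A real model would be a network.
    target = state @ theta                      # shape: (action_dim,)
    return np.sum((action - target) ** 2, axis=-1)

def info_nce_loss(state, expert_action, negatives, theta):
    """InfoNCE loss for one sample: the expert action against counter-examples."""
    candidates = np.vstack([expert_action[None, :], negatives])  # expert is row 0
    logits = -energy(state, candidates, theta)  # low energy -> high logit
    logits = logits - logits.max()              # numerical stability
    log_prob_expert = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_expert

def grid_search_action(state, theta, low=-1.0, high=1.0, points=21):
    """Gradient-free inference: evaluate the energy on a grid, return the argmin."""
    axis = np.linspace(low, high, points)
    action_dim = theta.shape[1]
    mesh = np.meshgrid(*[axis] * action_dim, indexing="ij")
    grid = np.stack(mesh, axis=-1).reshape(-1, action_dim)
    return grid[np.argmin(energy(state, grid, theta))]

# Usage with made-up values.
state = np.array([0.5, -0.5])
theta = np.eye(2)
best_action = grid_search_action(state, theta)
```

Minimising this loss lowers the energy of the demonstrated action relative to the sampled alternatives, so the grid search tends to return actions close to those in the demonstrations. The cost of an exhaustive grid grows exponentially with the number of continuous action dimensions, so it is only practical for small continuous action spaces.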
Abstract
Game development is a long process that involves many stages before a product is ready for the market. Human play testing is among the most time-consuming of these stages, as testers must repeatedly perform tasks in search of errors in the code. Automated testing is therefore seen as a key technology for the gaming industry, as it would dramatically reduce development costs and improve efficiency. Toward this end, we propose EVOLUTE, a novel imitation learning-based architecture that combines behavioural cloning (BC) with energy-based models (EBMs). EVOLUTE is a two-stream ensemble model that splits the action space of autonomous agents into continuous and discrete tasks. The EBM stream handles the continuous tasks, for more refined and adaptive control, while the BC stream handles the discrete actions, to ease training. We evaluate EVOLUTE in a shooting-and-driving game, where the agent must navigate and continuously identify targets to attack. The proposed model generalises better than standard BC approaches, showing a wider range of behaviours and higher performance. EVOLUTE is also easier to train than a pure end-to-end EBM model, because discrete tasks can be quite sparse in the dataset, which forces an end-to-end EBM to explore a much wider set of possible actions during training.
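The two-stream split described above can be sketched as follows. This is an illustrative assumption rather than the paper's architecture: the names `buttons` and `sticks`, the sigmoid-and-threshold discrete head, and the candidate-set argmin for the continuous stream are all hypothetical choices made to show how the two outputs could be merged into one game action.

```python
import numpy as np

def discrete_stream(logits, threshold=0.5):
    """BC stream (assumed form): independent sigmoid per discrete action, thresholded."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (probs > threshold).astype(int)

def continuous_stream(energy_fn, state, candidates):
    """EBM stream (assumed form): pick the candidate action with the lowest energy."""
    energies = np.array([energy_fn(state, a) for a in candidates])
    return candidates[int(np.argmin(energies))]

def act(state, button_logits, energy_fn, candidates):
    """Merge both streams into a single action for the game."""
    return {
        "buttons": discrete_stream(button_logits),                   # e.g. fire, boost
        "sticks": continuous_stream(energy_fn, state, candidates),   # e.g. steering, aim
    }

# Usage with made-up values; the energy function is a stand-in for a trained model.
toy_energy = lambda s, a: float(np.sum((a - s) ** 2))
state = np.array([0.2, -0.4])
candidates = np.array([[0.0, 0.0], [0.2, -0.4], [1.0, 1.0]])
action = act(state, np.array([2.0, -3.0, 0.0]), toy_energy, candidates)
```

Keeping the discrete actions out of the energy model means the EBM only has to rank continuous candidates, which is one way to read the claim that the split eases training.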
