MIMIC: Integrating Diverse Personality Traits for Better Game Testing Using Large Language Model

Yifei Chen; Sarra Habchi; Lili Wei

TL;DR

MIMIC addresses the homogeneous strategies of automated game-testing agents by embedding diverse, human-like playstyles into a large language model framework. It combines a personality-driven Hybrid Planner, a memory system, and a reflective summarizer to generate and execute varied task strategies across multiple games, including Minecraft. Empirical results show higher code-level and interaction-level coverage than baselines, and a higher task success rate and greater behavioral diversity than a state-of-the-art Minecraft agent on a comprehensive task suite. The work suggests that personality-aware automation can yield richer testing trajectories and better edge-case discovery, with potential applicability beyond gaming to automated UI testing and HCI contexts.

Abstract

Modern video games pose significant challenges for traditional automated testing algorithms, yet intensive testing is crucial to ensure game quality. To address these challenges, researchers designed gaming agents using Reinforcement Learning, Imitation Learning, or Large Language Models. However, these agents often neglect the diverse strategies employed by human players due to their different personalities, resulting in repetitive solutions in similar situations. Without mimicking varied gaming strategies, these agents struggle to trigger diverse in-game interactions or uncover edge cases. In this paper, we present MIMIC, a novel framework that integrates diverse personality traits into gaming agents, enabling them to adopt different gaming strategies for similar situations. By mimicking different playstyles, MIMIC can achieve higher test coverage and richer in-game interactions across different games. It also outperforms state-of-the-art agents in Minecraft by achieving a higher task completion rate and providing more diverse solutions. These results highlight MIMIC's significant potential for effective game testing.
