An Investigation of the Test-Retest Reliability of the miniPXI

Aqeel Haider; Günter Wallner; Kathrin Gerling; Vero Vanden Abeele

TL;DR

This study evaluates the test-retest reliability of the miniPXI, a one-item-per-construct variant of the PXI, across four games and 100 participants, and compares it with multi-item measures and single-item proxies such as the NPS and an Appreciation item. With a three-week interval between sessions, intraclass correlation coefficients (ICCs) show that the miniPXI has mixed reliability (0.365–0.704 overall; Enjoyment highest at 0.704) with substantial genre- and construct-specific variation, while multi-item measures are generally more stable. NPS and Appreciation show good test-retest reliability overall, which suggests they are useful as global satisfaction proxies, though they may not fully capture the dimensions of player experience (PX). The findings underscore the dynamic nature of PX over time. Researchers are advised to use the miniPXI cautiously in longitudinal studies, to prefer multi-item measures for multidimensional constructs, and to reserve single-item proxies for global assessments.

Abstract

Repeated measurements of player experience are crucial in games user research for assessing how different designs evolve over time. However, this requires lightweight measurement instruments that are fit for purpose. In this study, we examine the test-retest reliability of the \emph{miniPXI} -- a short variant of the \emph{Player Experience Inventory} (\emph{PXI}), an established instrument for measuring player experience. We analyzed its test-retest reliability across four games with 100 participants, comparing it with four established multi-item measures and with single-item indicators such as the Net Promoter Score (\emph{NPS}) and overall enjoyment. The findings are mixed: the \emph{miniPXI} demonstrated varying levels of test-retest reliability, with some constructs showing good to moderate reliability and others being less consistent. In contrast, the multi-item measures exhibited moderate to good test-retest reliability, demonstrating their effectiveness in measuring player experience over time. Additionally, the single-item indicators (\emph{NPS} and overall enjoyment) demonstrated good reliability. Our results highlight the complexity of evaluating player experience over time with measures that use single or multiple items per construct. We conclude that single-item measures may not be appropriate for long-term investigations of more complex player experience (PX) dimensions, and we provide practical considerations for the applicability of such measures in repeated measurements.
