What Do We Mean by 'Pilot Study': Early Findings from a Meta-Review of Pilot Study Reporting at CHI
Belu Ticona, Amna Liaqat, Antonios Anastasopoulos
TL;DR
This study investigates how pilot studies are defined and reported in CHI papers, revealing conceptual vagueness and inconsistent practice in human–computer interaction research. It builds CHIPS, a dataset of 904 CHI papers mentioning pilot-related terms, and applies manual coding plus LLM-based annotation to categorize reporting structures and their influence on main studies. The findings show pilots are common but rarely treated as independent study units, and pilot results are often summarized with limited detail, constraining replicability. The work advocates for community-informed reporting guidelines and outlines next steps to broaden data coverage and refine annotation methods.
Abstract
Pilot studies (PS) are ubiquitous in HCI research. CHI papers routinely reference 'pilot studies', 'pilot tests', or 'preliminary studies' to justify design decisions, verify procedures, or motivate methodological choices. Yet despite their frequency, the role of pilot studies in HCI remains conceptually vague and empirically underexamined. Unlike fields such as medicine, nursing, and education, where pilot and feasibility studies have well-established definitions, guidelines, reporting standards, and even a dedicated research journal, the CHI community lacks a shared understanding of what constitutes a pilot study, why pilot studies are conducted, and how they should be reported. Many papers reference pilots 'in passing', without details about design, outcomes, or how the pilot informed the main study. This variability suggests a methodological blind spot in our community.
