Improving Data Quality via Pre-Task Participant Screening in Crowdsourced GUI Experiments

Takaya Miyama, Satoshi Nakamura, Shota Yamanaka

TL;DR

Reducing the proportion of workers who exhibit unexpected behavior, and tightening the pre-task threshold, systematically improves the goodness of fit and predictive accuracy of GUI performance models. Brief pre-task screening can therefore enhance data quality.

Abstract

In crowdsourced user experiments that collect performance data from graphical user interface (GUI) interactions, some participants ignore instructions or act carelessly, threatening the validity of performance models. We investigate a pre-task screening method that requires simple GUI operations analogous to the main task and uses the resulting error as a continuous quality signal. Our pre-task is a brief image-resizing task in which workers match an on-screen card to a physical card; workers whose resizing error exceeds a threshold are excluded from the main experiment. The main task is a standardized pointing experiment with well-established models of movement time and error rate. Across mouse- and smartphone-based crowdsourced experiments, we show that reducing the proportion of workers exhibiting unexpected behavior and tightening the pre-task threshold systematically improve the goodness of fit and predictive accuracy of GUI performance models, demonstrating that brief pre-task screening can enhance data quality.
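
The threshold-based exclusion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the error definition (relative disagreement between two size-adjustment trials), the `PreTaskResult` record, the function names, and the sample values and thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreTaskResult:
    """Hypothetical record of one worker's two size-adjustment trials."""
    worker_id: str
    trial1_px: float  # adjusted long-side length in trial 1 (pixels)
    trial2_px: float  # adjusted long-side length in trial 2 (pixels)


def resizing_error(result: PreTaskResult) -> float:
    """Assumed error measure: absolute difference between the two trials,
    normalized by their mean. The paper may define the error differently."""
    mean_px = (result.trial1_px + result.trial2_px) / 2.0
    return abs(result.trial1_px - result.trial2_px) / mean_px


def screen_workers(results: list[PreTaskResult], threshold: float) -> list[str]:
    """Return IDs of workers whose error does not exceed the threshold;
    workers above the threshold are excluded from the main experiment."""
    return [r.worker_id for r in results if resizing_error(r) <= threshold]


# Made-up sample data, purely for illustration.
results = [
    PreTaskResult("w1", 400.0, 404.0),  # small disagreement between trials
    PreTaskResult("w2", 300.0, 450.0),  # large disagreement between trials
    PreTaskResult("w3", 512.0, 512.0),  # identical trials
]

admitted = screen_workers(results, threshold=0.05)
print(admitted)
```

Because the error is a continuous signal, the same function can be called with progressively smaller thresholds to study how stricter screening changes the set of admitted workers and, downstream, the fit of the performance models.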

Paper Structure (57 sections, 5 equations, 38 figures)

Figures (38)

  • Figure 1: Overview of the proposed approach
  • Figure 2: The size-adjustment task by Li et al. (2020)
  • Figure 3: Size-adjustment task in Experiment 1
  • Figure 4: Pointing task in Experiment 1
  • Figure 5: Distribution of the adjusted long-side length of the card image across the two size-adjustment trials in Experiment 1
  • ...and 33 more figures