Towards Fair Pay and Equal Work: Imposing View Time Limits in Crowdsourced Image Classification
Gordon Lim, Stefan Larson, Yu Huang, Kevin Leach
TL;DR
The paper investigates whether imposing a view time limit on crowdsourced image classification can make compensation fairer and more predictable without sacrificing data quality. It reports a controlled study with three view-time cohorts (100 ms, 1000 ms, 2500 ms) on the Stanford Dogs dataset, compares the results to CIFAR-10H, and analyzes accuracy, consensus labeling, and worker affect. Key findings: the accuracy penalty from a time limit shrinks as view time increases, with a 1000 ms limit yielding accuracy comparable to the no-limit setting; a two-thirds majority consensus among workers improves final label accuracy; and PANAS scores indicate that workers prefer shorter limits. The study offers practical guidelines for implementing time limits in crowdsourcing pipelines and discusses ethical considerations, including fair compensation and transparency. It notes that the findings are limited to single-image tasks and suggests extending the approach to more complex labeling tasks.
Abstract
Crowdsourcing is a common approach to rapidly annotate large volumes of data in machine learning applications. Typically, crowd workers are compensated with a flat rate based on an estimated completion time to meet a target hourly wage. Unfortunately, prior work has shown that variability in completion times among crowd workers led to overpayment by 168% in one case, and underpayment by 16% in another. However, by setting a time limit for task completion, it is possible to manage the risk of overpaying or underpaying while still facilitating flat rate payments. In this paper, we present an analysis of the impact of a time limit on crowd worker performance and satisfaction. We conducted a human study with a maximum view time for a crowdsourced image classification task. We find that the impact on overall crowd worker performance diminishes as view time increases. Despite some images being challenging under time limits, a consensus algorithm remains effective at preserving data quality and filters images needing more time. Additionally, crowd workers' consistent performance throughout the time-limited task indicates sustained effort, and their psychometric questionnaire scores show they prefer shorter limits. Based on our findings, we recommend implementing task time limits as a practical approach to making compensation more equitable and predictable.
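The consensus step described above can be illustrated with a minimal sketch. It assumes a simple vote-share rule with a two-thirds threshold; the function names (`consensus_label`, `split_by_consensus`), the data layout, and the example labels are hypothetical and are not taken from the paper's implementation. Images that fail to reach consensus are set aside as candidates for a longer view time.

```python
from collections import Counter
from fractions import Fraction


def consensus_label(votes, threshold=Fraction(2, 3)):
    """Return the most common label if its vote share meets the threshold.

    `votes` is a list of labels from different workers for one image.
    Returns None when no label reaches the threshold (assumed rule).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Exact rational comparison avoids float rounding at exactly 2/3.
    if Fraction(count, len(votes)) >= threshold:
        return label
    return None


def split_by_consensus(votes_by_image, threshold=Fraction(2, 3)):
    """Separate images with a consensus label from those without one."""
    accepted = {}
    needs_more_time = []
    for image_id, votes in votes_by_image.items():
        label = consensus_label(votes, threshold)
        if label is None:
            needs_more_time.append(image_id)
        else:
            accepted[image_id] = label
    return accepted, needs_more_time


# Hypothetical votes from three workers per image.
example_votes = {
    "img_a": ["husky", "husky", "malamute"],
    "img_b": ["beagle", "basset", "foxhound"],
}
accepted, needs_more_time = split_by_consensus(example_votes)
```

In this example, `img_a` is accepted with the label `husky` because two of three workers agree, while `img_b` has no label reaching two-thirds and would be re-queued, for instance under a longer time limit.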
