Towards Fair Pay and Equal Work: Imposing View Time Limits in Crowdsourced Image Classification

Gordon Lim, Stefan Larson, Yu Huang, Kevin Leach

TL;DR

The paper investigates whether imposing a view time limit on crowdsourced image classification can promote fair pay and predictable compensation without sacrificing data quality. It conducts a controlled study with three view-time cohorts (100 ms, 1000 ms, 2500 ms) on the Stanford Dogs dataset, comparing results to CIFAR-10H and analyzing accuracy, consensus labeling, and worker affect. Key findings: a 1000 ms limit yields accuracy comparable to the CIFAR-10H setting, and the impact of the limit on worker performance diminishes as view time increases; a two-thirds majority consensus among workers improves final label accuracy; and workers prefer shorter limits, as reflected in PANAS scores. The study offers practical guidelines for implementing time limits in crowdsourcing pipelines and discusses ethical considerations, including fair compensation and transparency. It notes that the study is limited to single-image tasks and suggests avenues for extending the approach to more complex labeling tasks.
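
The two-thirds majority consensus mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `consensus_label`, the choice of "at least two-thirds" (rather than strictly more), and the use of `None` to flag images without consensus are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def consensus_label(votes, threshold=Fraction(2, 3)):
    """Return the most common label if its vote share reaches the threshold.

    `votes` is a list of labels from different workers for one image.
    Returns None when no label reaches the threshold (assumed here to
    mean the image lacks consensus and may need more view time).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Exact rational comparison avoids float rounding at exactly 2/3.
    if Fraction(count, len(votes)) >= threshold:
        return label
    return None


# Hypothetical votes for two images.
agreed = consensus_label(["beagle", "beagle", "pug"])
split = consensus_label(["beagle", "pug", "husky"])
```

Under these assumptions, the first call returns a label because two of three workers agree, while the second returns `None` because no breed reaches the threshold.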

Abstract

Crowdsourcing is a common approach to rapidly annotate large volumes of data in machine learning applications. Typically, crowd workers are compensated with a flat rate based on an estimated completion time to meet a target hourly wage. Unfortunately, prior work has shown that variability in completion times among crowd workers led to overpayment by 168% in one case, and underpayment by 16% in another. However, by setting a time limit for task completion, it is possible to manage the risk of overpaying or underpaying while still facilitating flat rate payments. In this paper, we present an analysis of the impact of a time limit on crowd worker performance and satisfaction. We conducted a human study with a maximum view time for a crowdsourced image classification task. We find that the impact on overall crowd worker performance diminishes as view time increases. Despite some images being challenging under time limits, a consensus algorithm remains effective at preserving data quality and filters images needing more time. Additionally, crowd workers' consistent performance throughout the time-limited task indicates sustained effort, and their psychometric questionnaire scores show they prefer shorter limits. Based on our findings, we recommend implementing task time limits as a practical approach to making compensation more equitable and predictable.
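
The flat-rate arithmetic behind the overpayment and underpayment risk can be sketched as follows. This is an illustrative calculation only: the function names, the example wage, and the example times are hypothetical, and the definition of deviation (effective hourly wage relative to the target wage) is an assumption rather than the definition used in the prior work cited in the abstract.

```python
def flat_rate_payment(target_hourly_wage, estimated_seconds):
    """Flat payment chosen so the estimated time meets the target wage."""
    return target_hourly_wage * estimated_seconds / 3600.0


def effective_hourly_wage(payment, actual_seconds):
    """Hourly wage a worker actually earns given their completion time."""
    return payment * 3600.0 / actual_seconds


def pay_deviation_pct(target_hourly_wage, estimated_seconds, actual_seconds):
    """Percent over (+) or under (-) the target wage for one worker."""
    payment = flat_rate_payment(target_hourly_wage, estimated_seconds)
    effective = effective_hourly_wage(payment, actual_seconds)
    return (effective - target_hourly_wage) / target_hourly_wage * 100.0


# Hypothetical task: 15.00 per hour target, estimated at 600 seconds.
fast_worker = pay_deviation_pct(15.0, 600, 300)   # finishes in half the time
slow_worker = pay_deviation_pct(15.0, 600, 1200)  # takes twice as long
```

In this sketch the deviation depends only on the ratio of estimated to actual time, so bounding the actual time with a limit bounds how far the effective wage can fall below the target.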

Paper Structure

This paper contains 21 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Experimental setup of view time limit. In this paper, we investigate 100/1000/2500ms time limits to examine their impact on data quality and worker experience.
  • Figure 2: Training procedure. Participants are shown examples of each class. They proceed after seeing all images in all classes.
  • Figure 3: Qualification and Test trials. Participants are shown a dog image and must select the best breed category. In the qualification stage, there is no time limit. In the time-limited test stage, the image disappears after 100ms, 1000ms, or 2500ms. Participants can revisit training images by clicking each category. A grey bar shows overall progress.
  • Figure 4: Participant Accuracy in CIFAR-10H and SDOGS-10H. No significant difference between CIFAR-10H and SDOGS-10H with a 1000ms view time limit suggests comparable performance at this optimal duration.
  • Figure 5: Average Time Taken on Incorrect vs. Correct Answers in CIFAR-10H. Crowd workers more often took longer on images they answered incorrectly. Note: 4 points ($<0.2\%$ of data) fall outside the y-axis range, which is truncated for clarity.
  • ...and 5 more figures