Adaptive Crowdsourcing Via Self-Supervised Learning

Anmol Kagrecha, Henrik Marklund, Benjamin Van Roy, Hong Jun Jeon, Richard Zeckhauser

TL;DR

This paper develops predict-each-worker, a new approach to crowdsourcing that combines self-supervised learning with a novel aggregation scheme. The scheme adapts the weights assigned to crowdworkers based on the estimates they provided for previous quantities.

Abstract

Common crowdsourcing systems average estimates of a latent quantity of interest provided by many crowdworkers to produce a group estimate. We develop a new approach -- predict-each-worker -- that leverages self-supervised learning and a novel aggregation scheme. This approach adapts weights assigned to crowdworkers based on estimates they provided for previous quantities. When skills vary across crowdworkers or their estimates correlate, the weighted sum offers a more accurate group estimate than the average. Existing algorithms such as expectation maximization can, at least in principle, produce similarly accurate group estimates. However, their computational requirements become onerous when complex models, such as neural networks, are required to express relationships among crowdworkers. Predict-each-worker accommodates such complexity as well as many other practical challenges. We analyze the efficacy of predict-each-worker through theoretical and computational studies. Among other things, we establish asymptotic optimality as the number of engagements per crowdworker grows.
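
The claim that a weighted sum can beat the plain average when skills vary or estimates correlate can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration, not the paper's data-generating process: the latent quantity is a standard Gaussian, each crowdworker reports it plus zero-mean Gaussian noise, and the noise covariance `noise_cov` is known (a "clairvoyant" setting). The function names are hypothetical.

```python
import numpy as np

def average_mse(noise_cov):
    """MSE of the plain average when Y = Z * 1 + noise (assumed model)."""
    k = noise_cov.shape[0]
    u = np.full(k, 1.0 / k)
    # The uniform weights sum to 1, so only the noise contributes to the error.
    return float(u @ noise_cov @ u)

def optimal_weights(noise_cov, prior_var=1.0):
    """Best linear weights for estimating Z from Y under the assumed model."""
    k = noise_cov.shape[0]
    ones = np.ones(k)
    cov_y = prior_var * np.outer(ones, ones) + noise_cov  # Cov(Y)
    cov_yz = prior_var * ones                             # Cov(Y, Z)
    return np.linalg.solve(cov_y, cov_yz)

def weighted_mse(noise_cov, weights, prior_var=1.0):
    """E[(w^T Y - Z)^2] for arbitrary weights under the assumed model."""
    bias = weights.sum() - 1.0
    return float(prior_var * bias**2 + weights @ noise_cov @ weights)

# Hypothetical crowd: one skilled worker and two noisy, highly correlated workers.
noise_cov = np.array([[0.1, 0.0, 0.0],
                      [0.0, 1.0, 0.9],
                      [0.0, 0.9, 1.0]])
w = optimal_weights(noise_cov)
print("weights:", w)
print("average MSE: ", average_mse(noise_cov))
print("weighted MSE:", weighted_mse(noise_cov, w))
```

In this toy setting the skilled worker receives most of the weight, and the two correlated workers are discounted because they largely repeat the same error. The sketch assumes the covariance is given; learning it from past estimates is the part that predict-each-worker addresses.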

Paper Structure

This paper contains 24 sections, 12 theorems, 47 equations, 5 figures, 1 table, 2 algorithms.

Key Result

Theorem 3.1

If $\mathbb{P}(Y_t \in \cdot|\theta)$ is absolutely continuous with respect to the Lebesgue measure and the corresponding density has product-form support, then $\{P^{(k)}_{*}\}_{k=1}^K$ determines $\theta$.
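
A special case may help build intuition for Theorem 3.1. The sketch below assumes, since this excerpt does not define it, that $P^{(k)}_{*}$ is the conditional distribution of crowdworker $k$'s estimate given the other crowdworkers' estimates. It also assumes a zero-mean Gaussian joint distribution, so that $\theta$ is just the covariance matrix. Under those assumptions, each conditional is a linear regression with a residual variance, and the $K$ regressions together recover the precision matrix, hence the covariance. The function names are hypothetical.

```python
import numpy as np

def conditional_params(cov, k):
    """Regression coefficients and residual variance of Y_k given the other coordinates."""
    others = [j for j in range(cov.shape[0]) if j != k]
    coef = np.linalg.solve(cov[np.ix_(others, others)], cov[others, k])
    resid_var = cov[k, k] - cov[k, others] @ coef
    return others, coef, resid_var

def precision_from_conditionals(cov):
    """Rebuild the precision matrix using only the K conditional distributions."""
    K = cov.shape[0]
    prec = np.zeros((K, K))
    for k in range(K):
        others, coef, resid_var = conditional_params(cov, k)
        # For a zero-mean Gaussian with precision matrix L:
        #   Var(Y_k | rest) = 1 / L[k, k]
        #   E[Y_k | rest]   = -sum_j (L[k, j] / L[k, k]) * y_j
        prec[k, k] = 1.0 / resid_var
        prec[k, others] = -coef / resid_var
    return prec

# Hypothetical covariance for three crowdworkers.
cov = np.array([[1.0, 0.3, 0.2],
                [0.3, 2.0, 0.5],
                [0.2, 0.5, 1.5]])
recovered_cov = np.linalg.inv(precision_from_conditionals(cov))
print(np.allclose(recovered_cov, cov))
```

The theorem itself covers a much broader class of distributions; the Gaussian case is only the simplest setting in which the recovery can be written in closed form.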

Figures (5)

  • Figure 1: Crowdsourcing: each crowdworker provides an estimate of an unobserved quantity $Z_t$, and a center aggregates them to produce a group estimate.
  • Figure 2: The number of crowdworkers required by clairvoyant and only-skills-clairvoyant policies to match the performance of averaging over various numbers of crowdworkers. These results were generated using the Gaussian data-generating process with $N=1000$ factors and factor concentration parameter $q=1.7$. Performance was measured in terms of mean-squared error.
  • Figure 3: $K=10$
  • Figure 4: $K=20$
  • Figure 5: $K=30$

Theorems & Definitions (19)

  • Definition 3.1: product-form support
  • Theorem 3.1: sufficiency of SSL
  • Theorem 3.2
  • Theorem 3.3
  • Lemma A.1: Hammersley-Clifford Theorem
  • proof
  • Theorem A.1: sufficiency of SSL
  • proof
  • Lemma B.0
  • Theorem B.1
  • ...and 9 more