Full Characterization of Adaptively Strong Majority Voting in Crowdsourcing

Margarita Boyarskaya, Panos Ipeirotis

TL;DR

This work models $\delta$-margin majority voting in crowdsourcing as a Gambler's Ruin–style absorbing Markov chain to derive exact closed-form expressions for consensus quality, expected votes to reach consensus, and the distribution of completion times. It shows how the threshold $\delta$ interacts with item difficulty and worker accuracy, and provides equivalence results to compare different worker pools and payment schemes. The framework is extended to uncertain worker quality via Bayesian priors (including Beta and mixtures), and validated with real data (Bluebirds) and Monte Carlo simulations, demonstrating strong alignment between theory and practice. Practically, the results guide ex-ante task design, budgeting, and incentive schemes, enabling quality guarantees under varying worker pools and unknown item difficulty. The work also outlines managerial implications and future directions, including extensions to non-binary and multi-stage tasks.

Abstract

In crowdsourcing, quality control is commonly achieved by having workers examine items and vote on their correctness. To minimize the impact of unreliable worker responses, a $\delta$-margin voting process is used, where additional votes are solicited until a predetermined threshold $\delta$ for agreement between workers is exceeded. The process is widely adopted but only as a heuristic. Our research presents a modeling approach using absorbing Markov chains to analyze the characteristics of this voting process that matter in crowdsourced processes. We provide closed-form equations for the quality of the resulting consensus vote, the expected number of votes required for consensus, the variance of vote requirements, and other distribution moments. Our findings demonstrate how the threshold $\delta$ can be adjusted to achieve quality equivalence across voting processes that employ workers with varying accuracy levels. We also provide efficiency-equalizing payment rates for voting processes with different expected response accuracy levels. Additionally, our model considers items with varying degrees of difficulty and uncertainty about the difficulty of each example. Our simulations, using real-world crowdsourced vote data, validate the effectiveness of our theoretical model in characterizing the consensus aggregation process. The results of our study can be effectively employed in practical crowdsourcing applications.
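
For orientation, the closed forms mentioned above can be illustrated with the classical gambler's-ruin identities for a random walk that starts at 0, steps $+1$ with probability $p$ and $-1$ otherwise, and stops at $\pm\delta$. These are textbook results, shown here on the assumption that they correspond to the consensus quality $Q$ and the expected number of votes $\mathbb{E}(n_{votes})$ for a single item with fixed worker accuracy $p$; the paper's own theorems may be stated in a different form.

$$
Q = \frac{p^{\delta}}{p^{\delta} + (1-p)^{\delta}},
\qquad
\mathbb{E}(n_{votes}) =
\begin{cases}
\delta \, \dfrac{2Q - 1}{2p - 1}, & p \neq \tfrac{1}{2}, \\[1ex]
\delta^{2}, & p = \tfrac{1}{2}.
\end{cases}
$$

As a worked instance, $p = 0.7$ and $\delta = 3$ give $Q = 0.343 / 0.370 \approx 0.927$ and $\mathbb{E}(n_{votes}) = 3 \cdot 0.854 / 0.4 \approx 6.4$ votes.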

Paper Structure

This paper contains 35 sections, 26 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: A Markov chain diagram illustrating state transitions for $\delta$-margin majority voting on a single item, for which the average worker pool accuracy is $p$. Node labels indicate the difference between the numbers of correct ($n_1$) and incorrect ($n_0$) votes, $n_1-n_0$. Consensus is reached in absorbing states $\delta$ and $-\delta$, the former resulting in a correct consensus vote and the latter in an incorrect one.
  • Figure 2: Theoretical values of quality $Q$ of consensus vote (Theorem \ref{th:Q_nonrand}) as a function of the probability of a correct answer $p$, for a fixed consensus threshold $\delta$.
  • Figure 3: Expected time to reach consensus as a function of the probability of correct answer $p$, for a fixed consensus threshold $\delta$.
  • Figure 4: $\mathrm{Var}(n_{votes})$ (left) and $\mathrm{Var}(n_{votes})/\mathbb{E}(n_{votes})$ (right) as a function of the probability of a correct answer $p$, for a fixed consensus threshold $\delta$ ($d$). (Note the differing scales.)
  • Figure 5: Expected time until reaching consensus, with standard deviation bounds, for selected values of $\delta$.
  • ...and 9 more figures
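
The chain described in the Figure 1 caption can be sketched as a small Monte Carlo simulation. This is a minimal illustration, not the authors' code: the function names, the seed, and the number of simulated items are hypothetical, and the closed-form helpers use the classical gambler's-ruin formulas for a walk absorbed at $\pm\delta$, assumed here to match the quantities the paper derives.

```python
import random


def simulate_delta_margin(p, delta, rng):
    """Run one delta-margin vote on a single item.

    Returns (is_correct, n_votes). The state is the difference
    n_1 - n_0 between correct and incorrect votes, as in Figure 1.
    """
    diff = 0
    n_votes = 0
    while abs(diff) < delta:
        # Each worker votes correctly with probability p.
        diff += 1 if rng.random() < p else -1
        n_votes += 1
    return diff == delta, n_votes


def gamblers_ruin_quality(p, delta):
    """Classical probability of absorbing at +delta before -delta."""
    return p**delta / (p**delta + (1 - p) ** delta)


def gamblers_ruin_expected_votes(p, delta):
    """Classical expected number of steps until absorption."""
    if p == 0.5:
        return float(delta**2)
    q = gamblers_ruin_quality(p, delta)
    return delta * (2 * q - 1) / (2 * p - 1)


def estimate(p, delta, n_items=20000, seed=0):
    """Estimate consensus quality and mean vote count by simulation."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    runs = [simulate_delta_margin(p, delta, rng) for _ in range(n_items)]
    quality = sum(correct for correct, _ in runs) / n_items
    mean_votes = sum(n for _, n in runs) / n_items
    return quality, mean_votes


if __name__ == "__main__":
    p, delta = 0.7, 3
    sim_q, sim_n = estimate(p, delta)
    print("simulated quality:", sim_q, "closed form:", gamblers_ruin_quality(p, delta))
    print("simulated votes:  ", sim_n, "closed form:", gamblers_ruin_expected_votes(p, delta))
```

Running the script prints the simulated and closed-form values side by side; with enough simulated items the two should agree to within sampling noise. Varying `delta` for a fixed `p` shows the trade-off between consensus quality and the number of votes spent per item.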