Full Characterization of Adaptively Strong Majority Voting in Crowdsourcing

Margarita Boyarskaya; Panos Ipeirotis

TL;DR

This work models $\delta$-margin majority voting in crowdsourcing as a Gambler's Ruin-style absorbing Markov chain to derive exact closed-form expressions for consensus quality, expected votes to reach consensus, and the distribution of completion times. It shows how the threshold $\delta$ interacts with item difficulty and worker accuracy, and provides equivalence results to compare different worker pools and payment schemes. The framework is extended to uncertain worker quality via Bayesian priors (including Beta and mixtures), and validated with real data (Bluebirds) and Monte Carlo simulations, demonstrating strong alignment between theory and practice. Practically, the results guide ex-ante task design, budgeting, and incentive schemes, enabling quality guarantees under varying worker pools and unknown item difficulty. The work also outlines managerial implications and future directions, including extensions to non-binary and multi-stage tasks.

Abstract

In crowdsourcing, quality control is commonly achieved by having workers examine items and vote on their correctness. To minimize the impact of unreliable worker responses, a $\delta$-margin voting process is used, in which additional votes are solicited until a predetermined threshold $\delta$ for agreement between workers is exceeded. The process is widely adopted, but only as a heuristic. Our research presents a modeling approach that uses absorbing Markov chains to analyze the characteristics of this voting process that matter in crowdsourced processes. We provide closed-form equations for the quality of the resulting consensus vote, the expected number of votes required for consensus, the variance of vote requirements, and other distribution moments. Our findings demonstrate how the threshold $\delta$ can be adjusted to achieve quality equivalence across voting processes that employ workers with varying accuracy levels. We also provide efficiency-equalizing payment rates for voting processes with different expected response accuracy levels. Additionally, our model considers items with varying degrees of difficulty and uncertainty about the difficulty of each example. Our simulations, using real-world crowdsourced vote data, validate the effectiveness of our theoretical model in characterizing the consensus aggregation process. The results of our study can be applied in practical crowdsourcing applications.
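
To make the voting process described above concrete, the following is a minimal Python sketch, not the authors' code. It assumes a single fixed worker accuracy `p` (each vote is independently correct with probability `p`) and that voting stops as soon as the absolute difference between correct and incorrect votes reaches `delta`. Under those assumptions the standard Gambler's Ruin results give a consensus quality of $p^\delta / (p^\delta + (1-p)^\delta)$ and an expected vote count of $\delta(2Q-1)/(2p-1)$, where $Q$ is the consensus quality (and $\delta^2$ when $p = 1/2$); the paper's own expressions may be parameterized differently. All function names here are hypothetical.

```python
import random


def consensus_quality(p: float, delta: int) -> float:
    """Probability that the delta-margin vote ends on the correct label.

    Standard Gambler's Ruin absorption probability for a walk that starts
    at 0 and stops at +delta (correct) or -delta (incorrect).
    """
    q = 1.0 - p
    return p**delta / (p**delta + q**delta)


def expected_votes(p: float, delta: int) -> float:
    """Expected number of votes until the margin reaches delta."""
    if abs(p - 0.5) < 1e-12:
        # Symmetric walk: expected absorption time is delta squared.
        return float(delta**2)
    return delta * (2.0 * consensus_quality(p, delta) - 1.0) / (2.0 * p - 1.0)


def simulate_item(p: float, delta: int, rng: random.Random) -> tuple[bool, int]:
    """Run one delta-margin voting process; return (correct?, votes used)."""
    margin = 0  # correct votes minus incorrect votes
    votes = 0
    while abs(margin) < delta:
        margin += 1 if rng.random() < p else -1
        votes += 1
    return margin > 0, votes


def monte_carlo(p: float, delta: int, runs: int = 20000, seed: int = 0) -> tuple[float, float]:
    """Estimate consensus quality and mean vote count by simulation."""
    rng = random.Random(seed)
    correct = 0
    total_votes = 0
    for _ in range(runs):
        ok, n = simulate_item(p, delta, rng)
        correct += ok
        total_votes += n
    return correct / runs, total_votes / runs


if __name__ == "__main__":
    p, delta = 0.7, 2  # illustrative values, not taken from the paper
    print("closed form:", consensus_quality(p, delta), expected_votes(p, delta))
    print("simulation: ", monte_carlo(p, delta))
```

Comparing the two printed lines for a few values of `p` and `delta` shows the trade-off the abstract refers to: raising `delta` improves consensus quality at the cost of more votes per item.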
