Design and Evaluation of Crowd-sourcing Platforms Based on Users Confidence Judgments

Samin Nili Ahmadabadi; Maryam Haghifam; Vahid Shah-Mansouri; Sara Ershadmanesh

TL;DR

This paper asks whether incorporating users' confidence judgments and metacognitive ability can improve crowdsourcing accuracy beyond standard majority voting. It introduces two systems: ReBaCS, which aggregates responses by majority voting (MV), and CoBaCS, which aggregates them by confidence-weighted majority voting (WMV). Both are supported by a probabilistic model of Type I and Type II decisions and by analytic error expressions that use normal approximations. The authors derive formulas for system error, run simulations, and conduct a real-world experiment (memory and tweet-based tasks) with 86 participants to compare performance. Results show that CoBaCS frequently outperforms ReBaCS, particularly when expert crowd members are scarce, which demonstrates the practical value of leveraging metacognition in crowd-based decision making. The work also suggests that metacognition can be measured once and applied across tasks to guide participant selection and weighting.

Abstract

Crowd-sourcing solves problems by assigning them to a large number of non-experts, called the crowd, who work on them in their spare time. In these systems, the final answer to a question is determined by summing up the votes obtained from the community. Easier access to community members through mobile phones and the Internet has increased the popularity of these systems. Two issues raised in crowd-sourcing are how to choose people and how to collect answers. Users are usually separated based on their performance in a pre-test. Designing the pre-test is challenging: its questions should test the characteristics of people that are relevant to the main questions. One way to increase the accuracy of crowd-sourcing systems is to take people's cognitive characteristics and decision-making model into account when forming a crowd and when estimating the accuracy of their answers. People can estimate the correctness of their responses while making a decision. The accuracy of this estimate is determined by a quantity called metacognitive ability. Here, metacognition refers to the case where the confidence level is considered along with the answer to increase the accuracy of the solution. In this paper, through both mathematical and experimental analysis, we answer the following question: Is it possible to improve the performance of a crowd-sourcing system by knowing the metacognition of individuals and by recording and using the users' confidence in their answers?
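
To make the question concrete, the following minimal Python sketch contrasts a plain majority vote with a vote weighted by self-reported confidence. The function names, the sample data, and the weighting scheme (raw confidence on a 1 to 5 scale used directly as the weight) are assumptions for illustration and are not necessarily the weighting the paper uses.

```python
from collections import defaultdict

def majority_vote(answers):
    """Return the option chosen by the most users (first seen wins a tie)."""
    counts = defaultdict(int)
    for answer in answers:
        counts[answer] += 1
    return max(counts, key=counts.get)

def confidence_weighted_vote(answers, confidences):
    """Return the option with the largest total confidence.

    Assumption: each user's weight is simply their reported confidence.
    """
    totals = defaultdict(float)
    for answer, confidence in zip(answers, confidences):
        totals[answer] += confidence
    return max(totals, key=totals.get)

# Hypothetical crowd of five users answering one binary question.
answers = [1, 1, 1, 2, 2]
confidences = [1, 1, 2, 5, 4]   # self-reported, 1 (low) to 5 (high)

mv = majority_vote(answers)
wmv = confidence_weighted_vote(answers, confidences)
print("majority vote:", mv)
print("confidence-weighted vote:", wmv)
```

In this made-up example the two rules disagree: three unsure users outvote two confident ones under the plain majority, while the weighted rule sides with the confident minority. Whether that helps depends on how well confidence tracks correctness, which is what metacognitive ability measures.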

Paper Structure (21 sections, 17 equations, 6 figures, 4 tables)

Introduction
Background
Crowdsourcing Systems
Steps of a crowdsourcing system
Meta-cognitive ability
Metacognition in Crowdsourcing Systems
Decision-Making Model
Type I Decision
Type II Decision
Method
Response Based CrowdSourcing System (ReBaCS)
Error Calculation
CoBaCS
Error Calculation
Evaluations Environment
...and 6 more sections

Figures (6)

Figure 1: Decision-making model. A normal random variable, $x$, is generated for each question. The Type I decision (the answer) is made by comparing $x$ with the decision-making threshold, $c1$. The Type II decision (the confidence report) is made in the same way, by comparing $x$ with the Type II decision-making threshold, $c2|A$. The green area shows the probability of answering correctly and reporting high confidence when the answer is option 2. The red area shows the probability of answering incorrectly and reporting low confidence when the true answer is option 2.
Figure 2: Performance of ReBaCS and CoBaCS in various populations: without filtering users at the beginning of the task, CoBaCS has no superiority over ReBaCS.
Figure 3: ReBaCS and CoBaCS exhibit divergent performance patterns based on user populations. CoBaCS excels when low meta-cognitive ability users are absent, while ReBaCS outperforms in the presence of expert users. The study employs histograms and percentages to illustrate these performance dynamics across various filters.
Figure 4: The Memory Task comprised two phases: an encoding phase where subjects memorized a random selection of words and a recall phase where subjects reported whether a presented word was seen in the encoding phase and rated their confidence in their response. Persian words, representing concepts like the sun rising, flight, and fountain, were used in the task.
Figure 5: Tweet Task. The task had 100 questions, each containing three tweets from a single account. Subjects were asked to guess the gender of the account owner and then rated their confidence in being correct from 1 to 5.
...and 1 more figures
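
The decision-making model described in the Figure 1 caption can be sketched as a small simulation. This is a hedged illustration only: the evidence means (plus or minus half of a hypothetical sensitivity parameter `d_prime`), the unit variance, and the symmetric confidence rule (high confidence when $x$ is farther than `c2` from `c1`) are assumptions, not the paper's exact parameterization.

```python
import random

def simulate_response(true_option, d_prime, c1, c2, rng):
    """Simulate one Type I answer and one Type II confidence report."""
    # Assumption: evidence x ~ Normal(+d'/2, 1) if option 2 is true,
    # and x ~ Normal(-d'/2, 1) if option 1 is true.
    mean = d_prime / 2 if true_option == 2 else -d_prime / 2
    x = rng.gauss(mean, 1.0)
    answer = 2 if x > c1 else 1            # Type I decision
    high_confidence = abs(x - c1) > c2     # Type II decision (assumed rule)
    return answer, high_confidence

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [simulate_response(2, 1.5, 0.0, 1.0, rng) for _ in range(10000)]

def accuracy(responses):
    """Fraction of responses that match the true option (2 in this run)."""
    return sum(answer == 2 for answer, _ in responses) / len(responses)

overall = accuracy(trials)
acc_high = accuracy([t for t in trials if t[1]])
acc_low = accuracy([t for t in trials if not t[1]])
print(f"overall={overall:.3f} high-conf={acc_high:.3f} low-conf={acc_low:.3f}")
```

Under these assumptions, high-confidence answers come from evidence far from the threshold, so they are correct more often than low-confidence ones. That gap is what makes confidence useful as a voting weight.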
