Mitigating Cognitive Biases in Multi-Criteria Crowd Assessment

Shun Ito, Hisashi Kashima

TL;DR

This work identifies inter-criteria cognitive biases, notably halo effects, in crowdsourced multi-criteria quality assessments and demonstrates how evaluating criteria simultaneously distorts the mean and variance of ratings across criteria. It introduces a Bayesian rating-aggregation framework and two targeted enhancements, an impression-based mean and a common-variance structure (ImpCDM), to mitigate these biases. Empirical results show that the ImpCDM approach improves the accuracy of potential-quality estimates compared with a baseline CIM, though predicting per-criterion quality remains challenging and may benefit from modeling relations between criteria. The study provides practical guidance for designing multi-criteria crowdsourcing tasks and offers a foundation for more robust debiasing in aggregation models.

Abstract

Crowdsourcing is an easy, cheap, and fast way to perform large-scale quality assessment; however, human judgments are often influenced by cognitive biases, which lowers their credibility. In this study, we focus on cognitive biases associated with multi-criteria assessment in crowdsourcing: crowdworkers who rate targets on multiple criteria simultaneously may provide biased responses due to the prominence of some criteria or their global impressions of the evaluation targets. To identify and mitigate such biases, we first create evaluation datasets using crowdsourcing and investigate the effect of inter-criteria cognitive biases on crowdworker responses. Then, we propose two specific model structures for Bayesian opinion aggregation models that consider inter-criteria relations. Our experiments show that incorporating the proposed structures into the aggregation model is effective in reducing cognitive biases and helps obtain more accurate aggregation results.

Paper Structure (15 sections, 8 equations, 8 figures, 9 tables)