Quality Control in Open-Ended Crowdsourcing: A Survey
Lei Chai, Hailong Sun, Jing Zhang
TL;DR
The paper surveys quality control for open-ended crowdsourcing, arguing that existing Boolean-task-centric approaches fail to address the large, often infinite, answer spaces and non-unique ground truths of open-ended tasks. It introduces a two-tier framework: a holistic quality model (task, worker, answer, system) and a fine-grained taxonomy of quality dimensions, evaluation metrics, and design decisions, complemented by a systematic literature review. The authors synthesize methods across task mapping, workflow design, task organization, worker expertise, contribution estimation, answer embeddings, reliability, and aggregation to offer a cohesive view of how to improve answer quality in open-ended settings. They also discuss practical challenges and future directions, especially in domain-specific design, general quality-control frameworks, intelligent workflow design, and the integration of LLMs and human feedback in open-ended crowdsourcing.
Abstract
Crowdsourcing provides a flexible approach for leveraging human intelligence to solve large-scale problems, and it has gained widespread acceptance in domains such as intelligent information processing, social decision-making, and crowd ideation. However, the uncertainty of participants significantly compromises answer quality, which has drawn substantial research interest. Existing surveys predominantly concentrate on quality control in Boolean tasks, which are generally formulated as simple label classification, ranking, or numerical prediction. By contrast, ubiquitous open-ended tasks such as question answering, translation, and semantic segmentation have not been sufficiently discussed. These tasks usually have large or even infinite answer spaces and non-unique acceptable answers, posing significant challenges for quality assurance. This survey focuses on quality control methods applicable to open-ended tasks in crowdsourcing. We propose a two-tiered framework to categorize related work. The first tier introduces a holistic view of the quality model, encompassing four key aspects: task, worker, answer, and system. The second tier refines the classification into more detailed categories, including quality dimensions, evaluation metrics, and design decisions, providing insight into the internal structure of the quality control framework for each aspect. We thoroughly investigate how these quality control methods are implemented in state-of-the-art work and discuss key challenges and potential future research directions.
