Measuring the Mental Health of Content Reviewers, a Systematic Review
Alexandra Gonzalez, J. Nathan Matias
TL;DR
This paper tackles the problem of measuring the mental health of content reviewers by surveying psychological instruments from adjacent professions. Through a PRISMA-guided review of 2,249 references (1,673 unique; 143 validation studies), it identifies 12 measures across seven relevant phenomena and distinguishes clinical from research-oriented uses. The findings reveal substantial gaps in validity and cross-cultural validation for many instruments when applied to high-volume, distanced content moderation settings, underscoring the need for adapting and validating measures that reflect the specific work conditions. The authors advocate for trauma-informed, culturally sensitive measurement approaches and multi-stakeholder collaboration to advance diagnosis, workplace design, and policy, ultimately improving the safety and well-being of content reviewers.
Abstract
Artificial intelligence and social computing rely on hundreds of thousands of content reviewers to classify high volumes of harmful and forbidden content. Many workers report long-term, potentially irreversible psychological harm. This work is similar to activities that cause psychological harm to other kinds of helping professionals even after small doses of exposure. Yet researchers struggle to measure the mental health of content reviewers well enough to inform diagnoses, evaluate workplace improvements, hold employers accountable, or advance scientific understanding. This systematic review summarizes psychological measures from other professions and relates them to the experiences of content reviewers. After identifying 1,673 potential papers, we reviewed 143 that validate measures in related occupations. We summarize the uses of psychological measurement for content reviewing, differences between clinical and research measures, and 12 measures that are adaptable to content reviewing. We find serious gaps in measurement validity in regions where content review labor is common. Overall, we argue for reliable measures of content reviewer mental health that match the nature of the work and are culturally relevant.
