Taboo and Collaborative Knowledge Production: Evidence from Wikipedia
Kaylea Champion, Benjamin Mako Hill
TL;DR
This study develops a Wiktionary-based method to identify taboo subjects and applies it to English Wikipedia to compare taboo and non-taboo articles. Using longitudinal Wikipedia data, the authors test five hypotheses about readership, contributions, quality, and identifiability, finding that taboo articles are more popular and higher quality yet attract more vandalism and low-quality edits. The results reveal a nuanced privacy-identity dynamic among contributors, with mixed support for identifiability concerns. The work offers design implications for privacy in peer-produced knowledge systems and provides a scalable approach to studying taboo across languages and platforms.
Abstract
By definition, people are reticent or even unwilling to talk about taboo subjects. Because subjects like sexuality, health, and violence are taboo in most cultures, important information on each of these subjects can be difficult to obtain. Are peer produced knowledge bases like Wikipedia a promising approach for providing people with information on taboo subjects? With its reliance on volunteers who might also be averse to taboo, can the peer production model produce high-quality information on taboo subjects? In this paper, we seek to understand the role of taboo in knowledge bases produced by volunteers. We do so by developing a novel computational approach to identify taboo subjects and by using this method to identify a set of articles on taboo subjects in English Wikipedia. We find that articles on taboo subjects are more popular than non-taboo articles and that they are frequently vandalized. Despite frequent vandalism attacks, we also find that taboo articles are higher quality than non-taboo articles. We hypothesize that stigmatizing societal attitudes will lead contributors to taboo subjects to seek to be less identifiable. Although our results are consistent with this proposal in several ways, we surprisingly find that contributors make themselves more identifiable in others.
