
Agreeing and Disagreeing in Collaborative Knowledge Graph Construction: An Analysis of Wikidata

Elisavet Koutsiana, Tushita Yadav, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl

TL;DR

The paper analyzes disagreements in Wikidata discussions to understand how a large collaborative knowledge graph community manages disagreements and reaches decisions. It employs a mixed-methods approach combining descriptive statistics, thematic analysis, radial-tree metrics, and content analysis on a dataset of over 480,000 threads and 1,000,000 posts across discussion channels. Key findings show that disagreements cluster in property-related channels, most controversial threads are process-driven, and over half fail to reach consensus, with Rebels driving debate but vandalism remaining rare. The study provides methodological contributions, annotated datasets, and practical guidance for designing tools and practices to improve decision-making and communication in collaborative KG projects.

Abstract

In this work, we study disagreements in discussions around Wikidata, an online knowledge community that builds the data backend of Wikipedia. Discussions are essential in collaborative work as they can increase contributor performance and encourage the emergence of shared norms and practices. While disagreements can play a productive role in discussions, they can also lead to conflicts and controversies, which impact contributors' well-being and their motivation to engage. We want to understand if and when such phenomena arise in Wikidata, using a mix of quantitative and qualitative analyses to identify the types of topics people disagree about, the most common patterns of interaction, and the roles people play when arguing for or against an issue. We find that decisions to create Wikidata properties are much faster than those to delete properties and that more than half of controversial discussions do not lead to consensus. Our analysis suggests that Wikidata is an inclusive community, considering different opinions when making decisions, and that conflict and vandalism are rare in discussions. At the same time, while one-fourth of the editors participating in controversial discussions contribute legitimate and insightful opinions about Wikidata's emerging issues, they respond with one or two posts and do not remain engaged in the discussions to reach consensus. Our work contributes to the analysis of collaborative KG construction with insights about communication and decision-making in projects, as well as with methodological directions and open datasets. We hope our findings will help managers and designers support community decision-making and improve discussion tools and practices.

Paper Structure (28 sections, 16 figures, 7 tables)

This paper contains 28 sections, 16 figures, 7 tables.

Figures (16)

  • Figure 1: An example of the Wikidata KG for the item Ada Lovelace. We use red annotations to highlight the main features.
  • Figure 2: An example of editing a statement for the Wikidata item humanity.
  • Figure 3: An example of a talk page for the Wikidata item humanity.
  • Figure 4: An example of a Wikidata thread from the Property proposal discussion channel. We use annotations to highlight the thread's main features.
  • Figure 5: Heatmap with the number of different types of revisions, the number of editors, and edits per item.
  • ...and 11 more figures