News Ninja: Gamified Annotation of Linguistic Bias in Online News
Smi Hinterreiter, Timo Spinde, Sebastian Oberdörfer, Isao Echizen, Marc Erich Latoschik
TL;DR
News Ninja tackles the scalability challenge of annotating linguistic bias in online news by combining education and data collection in a gamified framework. The system uses a tutorial, two data-annotation mechanics, five game modes, and direct/delayed feedback to train players and produce bias labels. On BABE sentences, it reaches an inter-annotator agreement (IAA) of $0.44$, which is on par with expert datasets ($0.39$) and surpasses crowdsourced baselines ($0.21$). New sentences annotated by players reach IAA levels comparable to experts ($0.399$), and player labels achieve $79.8\%$ accuracy against new expert labels, indicating high data quality. The approach demonstrates scalable, education-enhancing crowdsourcing for linguistic bias datasets, with potential for continuous dataset updates and application to other NLP annotation tasks, while highlighting considerations around ground truth definitions and cultural biases.
Abstract
Recent research shows that visualizing linguistic bias mitigates its negative effects. However, reliable automatic detection methods to generate such visualizations require costly, knowledge-intensive training data. To facilitate data collection for media bias datasets, we present News Ninja, a game employing data-collecting game mechanics to generate a crowdsourced dataset. Before annotating sentences, players are educated on media bias via a tutorial. Our findings show that datasets gathered with crowdsourced workers trained on News Ninja can reach significantly higher inter-annotator agreements than expert and crowdsourced datasets with similar data quality. As News Ninja encourages continuous play, it allows datasets to adapt to the reception and contextualization of news over time, presenting a promising strategy to reduce data collection expenses, educate players, and promote long-term bias mitigation.
