BiaSWE: An Expert Annotated Dataset for Misogyny Detection in Swedish
Kätriin Kukk, Danila Petrelli, Judit Casademont, Eric J. W. Orlowski, Michał Dzieliński, Maria Jacobson
TL;DR
This work addresses the challenge of detecting misogyny in Swedish, a low-resource language in which misogyny is culturally and linguistically specific. The authors create BiaSWE, a small expert-annotated dataset, and document a transparent annotation process. Data is drawn from the Flashback forum using keyword-driven sampling and annotated in Label Studio through a four-task workflow guided by a Swedish misogyny taxonomy, with input from interdisciplinary experts. Key contributions are the BiaSWE dataset, its detailed annotation guidelines, and a methodological framework for interdisciplinary bias detection in Swedish. The study reports substantial inter-annotator agreement and offers a replicable path toward culturally informed misogyny detection in under-resourced languages, while acknowledging limitations and outlining concrete directions for expansion and refinement.
Abstract
In this study, we introduce the process for creating BiaSWE, an expert-annotated dataset tailored for misogyny detection in the Swedish language. To address the cultural and linguistic specificity of misogyny in Swedish, we collaborated with experts from the social sciences and humanities. Our interdisciplinary team developed a rigorous annotation process, incorporating both domain knowledge and language expertise, to capture the nuances of misogyny in a Swedish context. This methodology ensures that the dataset is not only culturally relevant but also aligned with broader efforts in bias detection for low-resource languages. The dataset, along with the annotation guidelines, is publicly available for further research.
