
BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service

Anna Kołos, Inez Okulska, Kinga Głąbińska, Agnieszka Karlińska, Emilia Wiśnios, Paweł Ellerik, Andrzej Prałat

TL;DR

BAN-PL introduces a substantial Polish offensive-language resource built from Wykop.pl content that was reported by users and banned by moderators, addressing the scarcity of public Polish data. The paper details data collection, an anonymization pipeline, and an analysis of linguistic features, biases, and topic structure, alongside preliminary modeling on the open subset. It provides an anonymized sample of 24,000 items with accompanying preprocessing tools and discusses how moderation dynamics shape data characteristics and generalization potential. The work positions BAN-PL as a practical resource for real-world moderation research, with planned incremental releases of the full dataset and related tools to foster reproducibility and further study of hate speech and cyberbullying in Polish. In practice, BAN-PL lets researchers study harmful content grounded in actual moderation decisions, while its anonymization framework balances research value with privacy considerations.

Abstract

Since the Internet is flooded with hate, mastering automated online content moderation is one of the main tasks for NLP experts. However, advancements in this field require improved access to publicly available, accurate, and non-synthetic datasets of social media content. For the Polish language, such resources are very limited. In this paper, we address this gap by presenting a new open dataset of offensive social media content for the Polish language. The dataset comprises content from Wykop.pl, a popular online service often referred to as the "Polish Reddit", reported by users and banned in the internal moderation process. It contains a total of 691,662 posts and comments, evenly divided into two categories: "harmful" and "neutral" ("non-harmful"). An anonymized subset of the BAN-PL dataset, consisting of 24,000 pieces (12,000 for each class), has been made publicly available along with preprocessing scripts. Furthermore, the paper offers insights into real-life content moderation processes and analyzes the linguistic features and content characteristics of the dataset. Moreover, a comprehensive anonymization procedure is described and applied. The prevalent biases encountered in similar datasets, including post-moderation and pre-selection biases, are also discussed.

Paper Structure (18 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Percentage of samples from the KLEJ test set labeled as offensive by three Wykop.pl moderators
  • Figure 2: Number of posts and comments from the "harmful" class by quarters
  • Figure 3: 20 topics for the anonymized BAN-PL
  • Figure 4: 20 topics for the entire dataset