A Holistic Indicator of Polarization to Measure Online Sexism

Vahid Ghafouri, Jose Such, Guillermo Suarez-Tangil

TL;DR

The paper addresses the challenge of measuring online sexism with a holistic indicator that is comparable across communities. It fuses a supervised toxicity detector with an unsupervised, WEAT-inspired bias test to compute three components for each adjective $w_i$: $T_{w_i}$, $FPR_{w_i}$, and $B_{w_i,S_A|S_B}$. These are combined into a single score averaged over the adjective set: $TargetedToxicity_{S_A|S_B} = \frac{\sum_i B_{w_i,S_A|S_B} \cdot FPR_{w_i} \cdot T_{w_i}}{|\{w_i\}|}$. The model distinguishes toxicity toward gender identity from toxicity toward individual figures and applies to both male- and female-related attribute sets, enabling fine-grained comparisons across 14 subreddits. Empirical evaluation shows strong alignment with expert assessments and high discriminative power, at lower computational cost than large transformers. The approach generalizes to other polarization analyses and supports policy and moderation decisions; code and data are publicly available on GitHub.
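To make the aggregation in the formula above concrete, here is a minimal Python sketch. It assumes the three per-adjective components have already been computed elsewhere and are supplied as dictionaries keyed by adjective; the function name `targeted_toxicity` and its parameter names are hypothetical and do not come from the authors' released code. This summary does not define how $T_{w_i}$, $FPR_{w_i}$, and $B_{w_i,S_A|S_B}$ are obtained, so the sketch covers only the final averaging step.

```python
from typing import Mapping


def targeted_toxicity(
    bias: Mapping[str, float],      # B_{w_i, S_A|S_B} per adjective (assumed precomputed)
    fpr: Mapping[str, float],       # FPR_{w_i} per adjective (assumed precomputed)
    toxicity: Mapping[str, float],  # T_{w_i} per adjective (assumed precomputed)
) -> float:
    """Average of B * FPR * T over the adjective set, as in the TL;DR formula."""
    words = set(bias)
    if words != set(fpr) or words != set(toxicity):
        raise ValueError("all three components must cover the same adjectives")
    if not words:
        raise ValueError("the adjective set must not be empty")
    # Numerator: sum of the per-adjective products.
    total = sum(bias[w] * fpr[w] * toxicity[w] for w in words)
    # Denominator: |{w_i}|, the number of adjectives.
    return total / len(words)
```

Because the score is a plain mean of products, an adjective contributes nothing when any one of its three components is zero, and the sign of its contribution follows the sign of the bias term.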

Abstract

The online trend of the manosphere and feminist discourse on social networks requires a holistic measure of the level of sexism in an online community. Such an indicator is important for policymakers, moderators of online communities (e.g., subreddits), and computational social scientists, either to revise moderation strategies based on the degree of sexism or to compare sexism over time across different platforms and communities, match it with real-world events, and infer social scientific insights. In this paper, we build a model that provides a comparable holistic indicator of toxicity targeted toward male and female identity and toward male and female individuals. Unlike previous supervised NLP methods, which require annotation of toxic comments at the target level (e.g., annotating comments that are specifically toxic toward women) to detect targeted toxic comments, our indicator uses supervised NLP to detect the presence of toxicity and an unsupervised word embedding association test to detect the target automatically. We apply our model to gender discourse communities (e.g., r/TheRedPill, r/MGTOW, r/FemaleDatingStrategy) to detect the level of toxicity toward genders (i.e., sexism). Our results show that our framework accurately and consistently (93% correlation) measures the level of sexism in a community. Finally, we discuss how our framework can be generalized in the future to measure qualities other than toxicity (e.g., sentiment, humor) toward general-purpose targets and become an indicator of different sorts of polarization.
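As an illustration of how a word embedding association test can point to a target without target-level annotation, the sketch below computes the word-level association used in the standard WEAT formulation: the mean cosine similarity of a word to one target set minus its mean cosine similarity to another. This is a generic sketch, not the paper's exact bias term; the function names and the toy two-dimensional vectors in the usage note are assumptions made for illustration only.

```python
import math
from statistics import fmean
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word: Vector, set_a: Sequence[Vector], set_b: Sequence[Vector]) -> float:
    """Standard WEAT word-level association s(w, A, B).

    Positive values mean the word sits closer to set A than to set B
    in the embedding space; negative values mean the opposite.
    """
    mean_a = fmean(cosine(word, a) for a in set_a)
    mean_b = fmean(cosine(word, b) for b in set_b)
    return mean_a - mean_b
```

With toy vectors, `association([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])` compares a word that is identical in direction to the single vector in set A and orthogonal to the single vector in set B. In practice the vectors would come from embeddings trained on each community's text.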

Paper Structure

This paper contains 19 sections, 4 equations, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Outlook of our processing pipeline.
  • Figure 2: Processing pipeline for building our Toxicity-Detector NLP model.
  • Figure 3: Validation Chart for Our Sexism Metric for Toxicity Toward Female Identity.
  • Figure 4: Toxicity Targeted Toward Male Identity.
  • Figure 5: Toxicity Targeted Toward Female Identity.
  • ...and 2 more figures