DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

Angus R. Williams, Hannah Rose Kirk, Liam Burke, Yi-Ling Chung, Ivan Debono, Pica Johansson, Francesca Stevens, Jonathan Bright, Scott A. Hale

TL;DR

This work tackles the challenge of building detectors for abuse targeted at public figures that generalise across domains (sport vs politics) and demographics (women vs men). It introduces the DoDo dataset (28,000 labelled tweets split equally across four domain–demographic pairs) and evaluates transfer by fine-tuning deBERTa-v3 on multiple dodo combinations, data budgets, and seeds. Key findings are that small amounts of diverse data substantially improve generalisability, that models transfer more easily across demographics than across domains, and that dataset similarity is a signal of transferability, offering a practical path to cost-effective abuse detection across groups. The results can inform policy and platform governance on deploying transferable, resource-efficient screening tools across varied target groups.

Abstract

Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale but labelling training data is expensive, complex and potentially harmful. So, it is desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification in order to understand how well classifiers trained on one domain or demographic can transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, containing 28,000 labelled entries, split equally across four domain-demographic pairs. We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.

Paper Structure

This paper contains 50 sections, 8 figures, and 8 tables.

Figures (8)

  • Figure 1: Mean and std-dev Macro-F1 across seeds for models trained on dodo combos, for fixed and full budgets, on test sets from seen and unseen dodos. *We removed one degenerate training seed (s=2).
  • Figure 2: Violin plot displaying distribution of change in Macro-F1 score when adding a dodo to the training data (7 possible scenarios), with mean represented by red marker.
  • Figure 3: Learning curves for starting with a dodo1 model trained on a single dodo pair and adding increments from the training set of a new dodo pair. We show mean and std-dev Macro-F1 (across 3 seeds) on the new adapt dodo and source start dodo at each increment.
  • Figure 4: Confusion matrices for dodo1 and dodo4 models evaluated on the total test set (12,000 entries).
  • Figure 5: Jaccard similarity and mean 0-shot Macro-F1 for dodo1 deBERT models with line of best fit. On-graph annotations indicate the evaluation dodo. Shows positive correlation ($\rho=0.7$) and effectiveness of cross-demographic vs. cross-domain transfer.
  • ...and 3 more figures