Impacts of Racial Bias in Historical Training Data for News AI

Rahul Bhargava, Malene Hornstrup Jespersen, Emily Boardman Ndulue, Vivica Dsouza

TL;DR

This paper investigates how historical racial biases embedded in a widely used training corpus can propagate through AI tools in journalism. It uses a mixed-methods audit of a multi-label classifier trained on the NYT Annotated Corpus, focusing on a problematic "blacks" label and its interpretation via LIME explanations and content analysis across multiple datasets. The findings show that the label reflects decades-old attitudes, fails to map cleanly to contemporary issues (e.g., anti-Asian discrimination, Black Lives Matter), and is sensitive to language and dataset shifts. The work argues for rigorous newsroom auditing, transparency about training data, and cautious deployment of AI tools to avoid reproducing representational harms in news analysis and discovery.

Abstract

AI technologies have rapidly moved into business and research applications that involve large text corpora, including computational journalism research and newsroom settings. These models, trained on extant data from various sources, can be conceptualized as historical artifacts that encode decades-old attitudes and stereotypes. This paper investigates one such example, a multi-label classifier trained on the broadly used New York Times Annotated Corpus. Our use in research settings surfaced the concerning "blacks" thematic topic label. Through quantitative and qualitative means we investigate this label's use in the training corpus, what concepts it might be encoding in the trained classifier, and how those concepts impact our model use. Via the application of explainable AI methods, we find that the "blacks" label operates partially as a general "racism detector" across some minoritized groups. However, it performs poorly against expectations on modern examples such as COVID-19 era anti-Asian hate stories and reporting on the Black Lives Matter movement. This case study of interrogating embedded biases in a model reveals how similar applications in newsroom settings can lead to unexpected outputs that could impact a wide variety of potential uses of any large language model: story discovery, audience targeting, summarization, etc. The fundamental tension this exposes for newsrooms is how to adopt AI-enabled workflow tools while reducing the risk of reproducing historical biases in news coverage.

Paper Structure

This paper contains 19 sections, 3 figures.

Figures (3)

  • Figure 1: Use of the label blacks and the word "blacks" over time
  • Figure 2: Box plots showing the distribution of predicted probabilities of being assigned the blacks label for each evaluation set. The top plot shows all predictions; the bottom plot shows only predictions with probability > 0, for clearer visualization.
  • Figure 3: Mean prediction weight of the 10 most influential words across 20 samples from each of the four datasets, as calculated by LIME. Negative weights (green) indicate words that weigh toward predicting the label blacks; positive weights (red) indicate words that weigh away from it.
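
The aggregation described in the Figure 3 caption can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name `mean_word_weights` and the toy weights are hypothetical, and two details are assumptions because the caption does not specify them (each word is averaged only over the samples in which it appears, and words are ranked by the absolute value of that mean). The input is one list of (word, weight) pairs per sample, which is the shape LIME's `Explanation.as_list()` is understood to return. The toy data follows the caption's sign convention, with negative weights pointing toward the label.

```python
from collections import defaultdict
from statistics import mean


def mean_word_weights(explanations, top_k=10):
    """Average per-word weights over several per-sample explanations.

    `explanations` holds one entry per sample; each entry is a list of
    (word, weight) pairs. Assumption: a word is averaged only over the
    samples where it appears, and ranking uses the absolute mean weight.
    """
    weights_by_word = defaultdict(list)
    for explanation in explanations:
        for word, weight in explanation:
            # Lowercase so "Police" and "police" count as the same word.
            weights_by_word[word.lower()].append(weight)

    averaged = {word: mean(ws) for word, ws in weights_by_word.items()}
    ranked = sorted(averaged.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_k]


# Made-up weights for three hypothetical samples (not values from the paper).
toy_explanations = [
    [("racism", -0.30), ("police", -0.10), ("weather", 0.05)],
    [("racism", -0.20), ("minority", -0.12)],
    [("Police", -0.20), ("weather", 0.15)],
]

top_words = mean_word_weights(toy_explanations, top_k=3)
for word, weight in top_words:
    print(f"{word}: {weight:+.3f}")
```

Averaging over all samples instead (treating an absent word as weight 0) would be an equally plausible reading of the caption and would shrink the means of rarely occurring words.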