A Quantitative Discourse Analysis of Asian Workers in the US Historical Newspapers

Jaihyun Park, Ryan Cordell

TL;DR

This paper tackles the understudied question of how Asian workers were depicted in U.S. historical newspapers by applying quantitative discourse analysis to the Chronicling America corpus. It combines word embeddings (Word2Vec CBOW and Skip-gram) to assess state-level semantic variation of the derogatory term 'coolie' and uses a log-odds ratio with an informative Dirichlet prior $\delta_w^{(i-j)}$ to contrast then-Confederate vs then-Union lexicons, alongside an OCR-resilient 5-gram text-reprint detector to map reprint networks. Key findings include distinct semantic contexts for MA and RI, strong Confederate associations with slavery-related vocabulary, and Union-leaning terms around labor and wages, plus a highly clustered reprint network highlighting political and cultural reuse of 'coolie' stories. The work advances digital humanities by quantifying racism toward Asians in historical media and provides openly available code to support replication and extension within historical sociolinguistics and racial discourse research.
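
The log-odds ratio with an informative Dirichlet prior mentioned above can be illustrated with a small sketch. This is not the authors' released code. It follows the standard formulation of the statistic, in which $\delta_w^{(i-j)}$ is divided by an approximate standard deviation to give a z-score. The function name `log_odds_z_scores`, the toy token lists, and the choice to use the pooled counts of both groups as the prior are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_z_scores(counts_i, counts_j, prior):
    """Z-scored log-odds ratio with an informative Dirichlet prior.

    counts_i, counts_j: word -> count in corpus i and corpus j.
    prior: word -> pseudo-count alpha_w (must be > 0 for every scored word).
    Positive z means the word is over-represented in corpus i.
    """
    n_i = sum(counts_i.values())
    n_j = sum(counts_j.values())
    alpha_0 = sum(prior.values())
    z = {}
    for w, alpha_w in prior.items():
        y_i = counts_i.get(w, 0)
        y_j = counts_j.get(w, 0)
        # delta_w^(i-j): difference of prior-smoothed log-odds
        delta = (math.log((y_i + alpha_w) / (n_i + alpha_0 - y_i - alpha_w))
                 - math.log((y_j + alpha_w) / (n_j + alpha_0 - y_j - alpha_w)))
        # Approximate variance of delta
        variance = 1.0 / (y_i + alpha_w) + 1.0 / (y_j + alpha_w)
        z[w] = delta / math.sqrt(variance)
    return z


# Toy data (invented for illustration, not taken from the corpus).
group_i = Counter("slave trade slave labor plantation".split())
group_j = Counter("wages labor strike wages labor".split())

# Assumption: the pooled counts serve as the informative prior.
prior = group_i + group_j

z_scores = log_odds_z_scores(group_i, group_j, prior)
for word, score in sorted(z_scores.items(), key=lambda kv: kv[1]):
    print(f"{word:12s} {score:+.2f}")
```

In practice the prior is often estimated from a larger background corpus rather than from the two groups being compared. Words with large positive or negative z-scores are the ones treated as over-represented in one group relative to the other.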

Abstract

Warning: This paper contains examples of offensive language targeting marginalized populations. The digitization of historical texts invites researchers to explore large-scale historical corpora with computational methods. In this study, we present a computational text analysis of a relatively understudied topic: how Asian workers are represented in historical newspapers in the United States. We found that the word "coolie" was semantically different in some States (e.g., Massachusetts, Rhode Island, Wyoming, Oklahoma, and Arkansas), reflecting different discourses around the term. By measuring over-represented words, we also found that then-Confederate and then-Union newspapers formed distinct discourses. Newspapers from then-Confederate States associated "coolie" with slavery-related words. In addition, we found that Asians were perceived as inferior to European immigrants and were targets of racism. This study supplements qualitative analyses of racism in the United States with quantitative discourse analysis.

Paper Structure

This paper contains 14 sections, 1 equation, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: The count of text data containing the word "coolie" by State
  • Figure 2: The heatmap of cosine similarity comparison across the average embedding vector of the word "coolie" in each State
  • Figure 3: The Z-score of words in then-Confederate and then-Union newspapers
  • Figure 4: The reprint network of "coolie" stories in the newspapers
  • Figure 5: The text containing "coolie" in The Opelousas courier published on July 8th, 1876
  • ...and 1 more figure