A Quantitative Discourse Analysis of Asian Workers in the US Historical Newspapers
Jaihyun Park, Ryan Cordell
TL;DR
This paper addresses the understudied question of how Asian workers were depicted in U.S. historical newspapers by applying quantitative discourse analysis to the Chronicling America corpus. It uses word embeddings (Word2Vec CBOW and Skip-gram) to assess state-level semantic variation of the derogatory term 'coolie', a log-odds ratio with an informative Dirichlet prior, $\delta_w^{(i-j)}$, to contrast then-Confederate and then-Union lexicons, and an OCR-resilient 5-gram text-reprint detector to map reprint networks. Key findings include distinct semantic contexts for the term in MA and RI, strong associations with slavery-related vocabulary in then-Confederate newspapers, and vocabulary centered on labor and wages in then-Union newspapers, plus a highly clustered reprint network that highlights political and cultural reuse of 'coolie' stories. The work advances digital humanities by quantifying racism toward Asians in historical media and provides openly available code to support replication and extension in historical sociolinguistics and racial discourse research.
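To make the log-odds comparison mentioned above more concrete, the following is a minimal sketch of one commonly used formulation of the log-odds ratio with an informative Dirichlet prior, standardized into a z-score per word. It is not the authors' released code: the function name `log_odds_dirichlet`, the toy word counts, and the choice of the pooled counts of both groups as the prior are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_dirichlet(counts_i, counts_j, prior_counts):
    """Return {word: z-score} for group i versus group j.

    Hypothetical helper: delta is the log-odds difference with prior
    pseudo-counts added, and the z-score divides delta by its
    approximate standard deviation.
    """
    n_i = sum(counts_i.values())        # total tokens in group i
    n_j = sum(counts_j.values())        # total tokens in group j
    a_0 = sum(prior_counts.values())    # total prior pseudo-counts
    scores = {}
    for word, a_w in prior_counts.items():
        y_i = counts_i.get(word, 0)
        y_j = counts_j.get(word, 0)
        delta = (
            math.log((y_i + a_w) / (n_i + a_0 - y_i - a_w))
            - math.log((y_j + a_w) / (n_j + a_0 - y_j - a_w))
        )
        variance = 1.0 / (y_i + a_w) + 1.0 / (y_j + a_w)
        scores[word] = delta / math.sqrt(variance)
    return scores


# Toy counts, invented purely for illustration (not data from the paper).
group_i = Counter({"slave": 8, "labor": 2, "trade": 5})
group_j = Counter({"slave": 2, "labor": 8, "trade": 5})

# Assumption: the informative prior is the pooled count of both groups.
prior = group_i + group_j

scores = log_odds_dirichlet(group_i, group_j, prior)
for word, z in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:>6s}  z = {z:+.3f}")
```

Positive scores mark words over-represented in the first group and negative scores mark words over-represented in the second; a word used at the same rate in both groups scores zero. In practice the prior is often estimated from a larger background corpus rather than from the two groups being compared.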
Abstract
Warning: This paper contains examples of offensive language targeting marginalized populations. The digitization of historical texts invites researchers to explore large-scale corpora of historical texts with computational methods. In this study, we present a computational text analysis of a relatively understudied topic: how Asian workers were represented in historical newspapers in the United States. We found that the word "coolie" was semantically different in some states (e.g., Massachusetts, Rhode Island, Wyoming, Oklahoma, and Arkansas), reflecting different discourses around the term. By measuring over-represented words, we also found that then-Confederate and then-Union newspapers formed distinct discourses. Newspapers from then-Confederate states associated "coolie" with slavery-related words. In addition, we found that Asians were perceived as inferior to European immigrants and were targets of racism. This study supplements qualitative analyses of racism in the United States with quantitative discourse analysis.
