Quantifying Gender Stereotypes in Japan between 1900 and 1999 with Word Embeddings

Shintaro Sakai, Haewoon Kwak, Jisun An, Akira Matsui

TL;DR

The paper develops a diachronic, corpus-based approach to quantify gender stereotypes in Japan from 1900 to 1999 using 100 year-specific word embeddings and a WEAT-inspired bias measure. It constructs domain and occupation targets, bootstraps stereotype estimates, and analyzes shifts in Home, Work, Politics, and 18 occupations, linking linguistic trends to demographic changes and Allied occupation reforms. The results show Work and Politics becoming more female-stereotyped after mid-century, Home remaining strongly female-associated, and occupational stereotypes gradually aligning with women’s labor-force participation, though most occupations remain male-stereotyped on average; a corpus-wide trend suggests broader linguistic shifts beyond the targeted domains. The study highlights partial reflection of real-world gender changes in language, while also revealing enduring cultural norms and methodological considerations when using historical embeddings to study social stereotypes.

Abstract

We quantify the evolution of gender stereotypes in Japan from 1900 to 1999 using a series of 100 word embeddings, each trained on a corpus from a specific year. We define the gender stereotype value to measure the strength of a word's gender association by computing the difference in cosine similarity of the word to female- versus male-related attribute words. We examine trajectories of gender stereotypes across three traditionally gendered domains (Home, Work, and Politics) as well as occupations. The results indicate that language-based gender stereotypes partially evolved to reflect women's increasing participation in the workplace and politics: the Work and Politics domains became more strongly female-stereotyped over the years. Yet Home also became more female-stereotyped, suggesting that women were increasingly viewed as fulfilling multiple roles, such as homemakers, workers, and politicians, rather than having one role replace another. Furthermore, the strength of the female stereotype for occupations positively correlates with the proportion of women in each occupation, indicating that word-embedding-based measures of gender stereotype mirrored demographic shifts to a considerable extent.
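
The gender stereotype value described above can be illustrated with a minimal sketch. It assumes the per-attribute cosine similarities are averaged before taking the female-minus-male difference, as is common in WEAT-style measures; the abstract does not spell out the aggregation, so this is an assumption rather than the paper's exact formula. The function names and the toy 2-dimensional vectors are hypothetical and serve only as illustration.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def stereotype_value(word_vec, female_vecs, male_vecs):
    """Mean cosine similarity to female attribute vectors minus
    mean cosine similarity to male attribute vectors.

    Positive values indicate a female association, negative values a
    male association. Averaging over attribute words is an assumption.
    """
    female_sim = np.mean([cosine(word_vec, f) for f in female_vecs])
    male_sim = np.mean([cosine(word_vec, m) for m in male_vecs])
    return float(female_sim - male_sim)


def trajectory(word_by_year, female_by_year, male_by_year):
    """Apply stereotype_value within each year's embedding space.

    Each argument is a dict keyed by year; vectors are only compared
    with vectors from the same year's model.
    """
    return {
        year: stereotype_value(word_by_year[year],
                               female_by_year[year],
                               male_by_year[year])
        for year in sorted(word_by_year)
    }


# Toy data (hypothetical): one female and one male attribute vector.
female = [np.array([1.0, 0.0])]
male = [np.array([0.0, 1.0])]

aligned_with_female = stereotype_value(np.array([1.0, 0.0]), female, male)
equidistant = stereotype_value(np.array([1.0, 1.0]), female, male)
```

Because each year has its own embedding model, the difference is computed within a single year's vector space, which avoids having to align the spaces across years for this particular measure.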

Paper Structure

This paper contains 24 sections, 3 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: The total number of n-grams in each year in the NDL Ngram Data
  • Figure 2: The changes in female stereotypes of Home, Work, and Politics over the century. The red vertical line corresponds to 1945, the year World War II ended. We also include shaded regions representing the 95% confidence intervals around the stereotype estimates for each domain. However, the intervals are not visible in the plots because their width is negligible compared to the magnitude of the overall stereotype change.
  • Figure 3: The change in female stereotypes averaged across all 18 occupations over the century. The red vertical line in the figure corresponds to 1945.
  • Figure 4: The overall correlation between the proportion of women in each occupation-year and the corresponding female stereotype value.
  • Figure 5: The changes of the averaged female stereotypes of all the words that appeared consistently across all 100 of our word embedding models. The red vertical line in the figure corresponds to 1945.
  • ...and 6 more figures