Computational Analysis of Gender Depiction in the Comedias of Calderón de la Barca

Allison Keith, Antonio Rojas Castro, Hanno Ehrlicher, Kerstin Jung, Sebastian Padó

TL;DR

This study quantitatively analyzes gender depiction in Calderón de la Barca's comedias by training a BETO-based classifier on character speech and interpreting its decisions with Integrated Gradients. It introduces a framework that combines three levels of input granularity (utterance, scene, character) with three aggregation strategies (none, majority, geometric mean) to produce robust, scene-aware predictions, achieving up to $F_1 = 0.83$ at the character level. The authors identify qualitative gender cues, both lexical and semantic, such as domestic versus military spheres and relationship-focused language, and show that cross-dressing scenes can be detected and analyzed at fine granularity. The method contributes to digital humanities by enabling scalable, explainable analysis of gender portrayal across a large corpus, while noting data limitations and offering directions for extending the approach to other traits, genres, and authors.
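The two non-trivial aggregation strategies named above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tie-breaking rule, the renormalisation step, and the example probabilities are all assumptions made for illustration. Each input value is a hypothetical classifier probability that one unit of speech (for example, one utterance) is 'female'.

```python
import math
from collections import Counter


def aggregate_majority(p_female):
    """Majority vote over per-unit hard labels.

    Assumption: a unit counts as 'female' if its probability is >= 0.5,
    and ties are resolved in favour of 'female' (an arbitrary choice).
    """
    if not p_female:
        raise ValueError("need at least one probability")
    votes = Counter("female" if p >= 0.5 else "male" for p in p_female)
    return "female" if votes["female"] >= votes["male"] else "male"


def aggregate_geometric_mean(p_female, eps=1e-12):
    """Geometric mean of per-class probabilities, renormalised to sum to 1.

    Assumption: both classes are averaged in log space and then
    renormalised; the source does not spell out its exact formula.
    """
    if not p_female:
        raise ValueError("need at least one probability")
    n = len(p_female)
    # Clamp to eps so that a probability of exactly 0 does not break log().
    log_f = sum(math.log(max(p, eps)) for p in p_female) / n
    log_m = sum(math.log(max(1.0 - p, eps)) for p in p_female) / n
    g_f, g_m = math.exp(log_f), math.exp(log_m)
    return g_f / (g_f + g_m)


# Hypothetical utterance-level probabilities for one character in one scene.
utterance_probs = [0.55, 0.60, 0.52, 0.02]

majority_label = aggregate_majority(utterance_probs)
geo_p_female = aggregate_geometric_mean(utterance_probs)
geo_label = "female" if geo_p_female >= 0.5 else "male"
```

In this made-up example three weakly 'female' utterances outvote one confidently 'male' utterance under majority voting, while the geometric mean lets the single confident prediction dominate, so the two strategies disagree. That sensitivity to confidence is the main practical difference between them.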

Abstract

In theatre, playwrights use the portrayal of characters to explore culturally based gender norms. In this paper, we develop quantitative methods to study gender depiction in the non-religious works (comedias) of Pedro Calderón de la Barca, a prolific Spanish 17th-century author. We gather insights from a corpus of more than 100 plays by training a gender classifier and applying model explainability (attribution) methods to determine which text features most influence the model's decision to classify speech as 'male' or 'female'. This surfaces the most gendered elements of dialogue in Calderón's comedias in a human-accessible manner. We find that female and male characters are portrayed differently and can be identified by the gender prediction model at practically useful accuracies (up to $F_1 = 0.83$). Our analysis reveals semantic aspects of gender portrayal and demonstrates that the model can even provide a relatively accurate scene-by-scene prediction for cross-dressing characters.
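For orientation, the attribution method named in the TL;DR, Integrated Gradients, is usually defined as follows. This is the standard textbook form, not a formula quoted from the paper; the symbols $F$ (the model's output score for one class), $x$ (the input representation), and $x'$ (a baseline input) are notation assumed here, and the source excerpt does not say which baseline the authors use.

$$
\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha
$$

Under this definition the attributions over all input dimensions sum to $F(x) - F(x')$, which is why the per-token scores can be read as each token's share of the shift toward 'male' or 'female'.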

Paper Structure

This paper contains 26 sections, 1 equation, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Gender classification for a character at different levels of granularity (blue = masculine, red = feminine). Saturation indicates model confidence.
  • Figure 2: Visualization of the attribution model: blue indicates male, orange female; saturation corresponds to strength of cue. Original passage (La vida es sueño, Rosaura, Act 1 Scene 1): Stay on this mountain, where the brutes have their Phaeton, and I, with no other path than the one given to me by the laws of destiny, blind and desperate, will lower my tangled head from this eminent mountain where the sun wrinkles the frown on my forehead.
  • Figure 3: Experiment 2: Probability that the character is predicted as female (prediction made at our first input level, all lines) for all female characters speaking more than 2000 words (green dot = predicted as female, red dot = predicted as male, green box = cross-dressing). Note: The highlighted instance of Semíramis is from La hija del aire II.
  • Figure 4: Experiment 2: Analysis of cross-dressing characters by scene. Cross-dressing scenes are indicated by blue bars. The y-axis shows the probability that the character is predicted as female.