Computational Analysis of Gender Depiction in the Comedias of Calderón de la Barca
Allison Keith, Antonio Rojas Castro, Hanno Ehrlicher, Kerstin Jung, Sebastian Padó
TL;DR
This study quantitatively analyzes gender depiction in Calderón de la Barca's comedias by training a BETO-based classifier on character speech and interpreting its decisions with Integrated Gradients. It introduces a principled framework that varies input granularity (utterance, scene, character) and aggregation strategy (none, majority, geometric mean) to produce robust, scene-aware predictions, achieving up to $F_1 = 0.83$ at the character level. The authors identify qualitative gender cues, both lexical and semantic, such as domestic versus military spheres and relationship-focused language, and demonstrate that cross-dressing scenes can be detected and analyzed at fine granularity. The method contributes to digital humanities by enabling scalable, explainable analysis of gender portrayal across a large corpus, while acknowledging data limitations and offering directions for extending to other traits, genres, and authors.
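To make the two aggregation strategies concrete, the following minimal sketch combines per-utterance classifier outputs into a single label for a larger unit such as a scene or character. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.5 threshold, the tie-breaking rule, and the exact form of the geometric mean (comparing the geometric means of the two class probabilities) are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def majority_vote(probs_female, threshold=0.5):
    """Label each utterance independently, then return the most common label.

    probs_female: per-utterance probabilities of the 'female' class
    (assumed to come from some binary classifier; hypothetical input).
    """
    if not probs_female:
        raise ValueError("need at least one prediction")
    labels = ["female" if p >= threshold else "male" for p in probs_female]
    counts = Counter(labels)
    # Ties go to 'male' purely as an illustrative convention.
    return "female" if counts["female"] > counts["male"] else "male"


def geometric_mean_vote(probs_female, eps=1e-12):
    """Compare the geometric means of the two class probabilities.

    Working in log space avoids underflow; eps guards against log(0).
    """
    if not probs_female:
        raise ValueError("need at least one prediction")
    n = len(probs_female)
    log_f = sum(math.log(max(p, eps)) for p in probs_female) / n
    log_m = sum(math.log(max(1.0 - p, eps)) for p in probs_female) / n
    return "female" if log_f > log_m else "male"


# Three weakly 'female' utterances and one strongly 'male' utterance.
example = [0.55, 0.60, 0.52, 0.02]
print(majority_vote(example))        # counts labels only
print(geometric_mean_vote(example))  # also weighs prediction confidence
```

On this toy input the two strategies disagree: the majority vote sees three 'female' labels against one, whereas the geometric mean lets the single highly confident 'male' prediction outweigh three marginal ones. This is one reason the choice of aggregation can matter when utterances vary in how strongly gendered they are.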
Abstract
In theatre, playwrights use the portrayal of characters to explore culturally based gender norms. In this paper, we develop quantitative methods to study gender depiction in the non-religious works (comedias) of Pedro Calderón de la Barca, a prolific 17th-century Spanish author. We gather insights from a corpus of more than 100 plays by using a gender classifier and applying model explainability (attribution) methods to determine which text features most influence the model's decision to classify speech as 'male' or 'female'. These features indicate the most gendered elements of dialogue in Calderón's comedias in a human-accessible manner. We find that female and male characters are portrayed differently and can be identified by the gender prediction model at practically useful accuracies (up to $F_1 = 0.83$). Our analysis reveals semantic aspects of gender portrayal and demonstrates that the model can even provide relatively accurate scene-by-scene predictions for cross-dressing characters.
