Computational Analysis of Character Development in Holocaust Testimonies

Esther Shizgal, Eitan Wagner, Renana Keydar, Omri Abend

TL;DR

This paper studies character development in narratives by analyzing religious trajectories in Holocaust survivor testimonies at scale. It introduces a pipeline that segments each testimony, uses a RoBERTa classifier to detect segments with religious content, generates trajectories with GPT-4.1 and Mistral-7B, and then clusters the trajectories using both a predefined taxonomy and unsupervised methods based on dynamic time warping (DTW). Key contributions include a corpus-driven study of belief and practice trajectories across 1,000 transcripts, validation through inter-annotator agreement (IAA) and DTW-based comparison, and insights into common trajectory structures such as oscillating practice and constant-positive belief. The work demonstrates the potential of NLP and large language models to illuminate thematic trajectories in historical narratives while acknowledging biases and methodological limitations, and it lays groundwork for broader sociological interpretation and ethical data handling.

Abstract

This work presents a computational approach to analyze character development along the narrative timeline. The analysis characterizes the inner and outer changes the protagonist undergoes within a narrative, and the interplay between them. We consider transcripts of Holocaust survivor testimonies as a test case, each telling the story of an individual in first-person terms. We focus on the survivor's religious trajectory, examining the evolution of their disposition toward religious belief and practice along the testimony. Clustering the resulting trajectories in the dataset, we identify common sequences in the data. Our findings highlight multiple common structures of religiosity across the narratives: in terms of belief, most present a constant disposition, while for practice, most present an oscillating structure, serving as valuable material for historical and sociological research. This work demonstrates the potential of natural language processing techniques for analyzing character evolution through thematic trajectories in narratives.
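The structure names used for the trajectories (constant, oscillating, ascending, descending) can be illustrated with a minimal sketch. The encoding of valence as +1 (positive) and -1 (negative), the function name `classify_structure`, and the labeling rules are assumptions made for illustration; the paper's actual taxonomy rules are not given in this section.

```python
from typing import Sequence


def classify_structure(valences: Sequence[int]) -> str:
    """Label a valence sequence with a coarse structure name.

    Assumes a hypothetical encoding: +1 for positive, -1 for negative.
    The rules below are illustrative, not the paper's definition.
    """
    if not valences:
        raise ValueError("empty trajectory")

    # Collapse runs of repeated values so that only the changes remain.
    changes = [valences[0]]
    for v in valences[1:]:
        if v != changes[-1]:
            changes.append(v)

    # A single remaining value means the disposition never changed.
    if len(changes) == 1:
        return "constant-positive" if changes[0] > 0 else "constant-negative"

    # Inspect the direction of every change.
    steps = [b - a for a, b in zip(changes, changes[1:])]
    if all(s > 0 for s in steps):
        return "ascending"
    if all(s < 0 for s in steps):
        return "descending"
    return "oscillating"
```

For example, under these assumed rules a trajectory that moves from negative to positive once is labeled ascending, while one that changes direction at least twice is labeled oscillating.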

Paper Structure

This paper contains 48 sections, 1 equation, 23 figures, 10 tables.

Figures (23)

  • Figure 1: Our pipeline for identifying the religious trajectories from a set of Holocaust survivor testimonies: (1) Segmentation: segment the testimonies by question-answer pairs; (2) Filtration: train and run a classifier to filter all segments containing religious content; (3) Determining Valence: use LLMs to identify the protagonist's valence of religious practice and/or beliefs in a given segment; (4) Schematization: cluster the resulting trajectories to identify common patterns of evolution of religiosity in the testimony dataset.
  • Figure 2: An annotation example from the platform provided to the annotators for identifying the survivor's valence of religious practice and belief in each segment.
  • Figure 3: Alignment of predicted trajectories with reference trajectories for testimony ID: 45091, illustrating the differences and overlaps between them. The colored rectangle widths correspond to the segment lengths. The x-axis represents the normalized position within the testimony transcript.
  • Figure 4: Distributions of religious trajectory structures. From left to right: 68% of the practice trajectories have an oscillating structure and 25% are constant-active. For belief, the largest share of trajectories have a constant-positive structure (45%), and the rest are distributed among oscillating (26%), ascending (12%), constant-negative (9%), and descending (8%). For the intersection of the two aspects, the two largest groups cover 44% of the intersection; both have an oscillating practice structure, while the belief structure is split evenly between oscillating and constant-positive.
  • Figure 5: Distribution of all religious content. The distributions on which we based the baselines are computed for each label separately.
  • ...and 18 more figures
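
The DTW-based comparison of trajectories mentioned in the TL;DR can be sketched as follows. This is the textbook dynamic time warping recurrence with an absolute-difference local cost, not necessarily the variant used in the paper; the function name `dtw_distance` and the numeric valence encoding (+1 and -1) are assumptions made for illustration.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative sketch: local cost is the absolute difference, with no
    window constraint and no length normalization.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")

    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because DTW allows one sequence to be stretched against the other, two trajectories that make the same change at different points in the testimony can still receive a distance of zero, which is why it suits sequences of unequal length or pacing.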