Quantitative Intertextuality from the Digital Humanities Perspective: A Survey
Siyu Duan
TL;DR
This survey maps the emergence of quantitative intertextuality in digital humanities, detailing data sources across Chinese, English, Latin, and other languages, and outlining a full computational pipeline from data preprocessing to intertextual network analysis. It covers a spectrum of intertextual features (overlap, vector, stylistic, metadata, sequence labeling), methods for parallel intertextual detection (text matching, thresholds, evaluation metrics, acceleration), and normalization strategies essential for cross-text comparison. The paper also groups applications into influence, preference, similarity, writing habits, and cultural evolution, and reviews platforms and tools that support humanities research in this domain. Finally, it discusses current challenges, such as benchmark scarcity and interpretability, and envisions future directions in cross-lingual and multimodal intertextuality, emphasizing the field’s potential to deepen interdisciplinary insights at scale.
Abstract
In literary theory, the connection between texts is referred to as intertextuality, a concept that has served as an important theoretical basis for many digital humanities studies. Over the past decade, advances in natural language processing have ushered intertextuality studies into the quantitative age, and large-scale intertextuality research based on cutting-edge methods has continued to emerge. This paper provides a roadmap for quantitative intertextuality studies, summarizing their data, methods, and applications. Covering data from multiple languages and topics, this survey reviews methods ranging from statistics to deep learning. It also summarizes their applications in humanities and social sciences research, along with the associated platforms and tools. Driven by advances in computer technology, more precise, diverse, and large-scale intertextuality studies can be anticipated. Intertextuality holds promise for broader application in interdisciplinary research bridging AI and the humanities.
