BERT in Plutarch's Shadows

Ivan P. Yamshchikov; Alexey Tikhonov; Yorgos Pantis; Charlotte Schubert; Jürgen Jost

TL;DR

This work develops an Ancient Greek BERT via transfer learning to tackle authorship attribution for Plutarch-related texts, particularly Pseudo-Plutarch. By combining MLM-based pretraining from related Greek models with a large, sentence-balanced downstream classifier, it achieves about $80\%$ validation accuracy and demonstrates regional signals that help contextualize authorship. Applying the model to De Fluviis, De Musica, and Placita Philosophorum suggests distinct authorial and regional profiles, with Placita most plausibly linked to an Alexandrian scientific milieu of the 2nd century CE rather than to Plutarch. The study highlights the feasibility of low-resource language modeling for classical texts and provides a data-driven, philology-friendly framework for future investigations into ancient authorship and reception.

Abstract

The extensive surviving corpus of the ancient scholar Plutarch of Chaeronea (ca. 45-120 CE) also contains several texts which, according to current scholarly opinion, did not originate with him and are therefore attributed to an anonymous author, Pseudo-Plutarch. These include, in particular, the work Placita Philosophorum (Quotations and Opinions of the Ancient Philosophers), which is extremely important for the history of ancient philosophy. Little is known about the identity of that anonymous author or about the author's relation to other writers of the same period. This paper presents a BERT language model for Ancient Greek. The model discovers previously unknown statistical properties relevant to these literary, philosophical, and historical problems and can shed new light on this authorship question. In particular, the Placita Philosophorum, together with one of the other Pseudo-Plutarch texts, shows similarities with texts written by authors from an Alexandrian context (2nd/3rd century CE).

Paper Structure (12 sections, 2 figures, 6 tables)

This paper contains 12 sections, 2 figures, 6 tables.

Introduction
Data
Ancient Greek BERT
Tokenizers
Training Ancient Greek BERT via Transfer Learning
Authorship Attribution with Ancient Greek BERT
Regional Attribution with Ancient Greek BERT
Classifying Pseudo-Plutarch
Discussion
Conclusion
Acknowledgments
Appendix

Figures (2)

Figure 1: A map showing the relative positions of three potential regions relevant to the authorship attribution of the Pseudo-Plutarch documents.
Figure 2: The three Pseudo-Plutarch documents differ significantly in the percentage of sentences attributed to each region. In particular, Placita Philosophorum is the only document in which Delphi is not the dominant region, while Alexandria is the most frequent identifiable region.
