Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin

Gleb Schmidt, Svetlana Gorovaia, Ivan P. Yamshchikov

TL;DR

Evaluating the performance of Large Language Models in authorship attribution and authorship verification tasks for Latin texts of the Patristic Era shows that LLMs can be robust in zero-shot authorship verification even on short texts without sophisticated feature engineering.

Abstract

This paper evaluates the performance of Large Language Models (LLMs) in authorship attribution and authorship verification tasks for Latin texts of the Patristic Era. The study shows that LLMs can be robust in zero-shot authorship verification even on short texts without sophisticated feature engineering. Yet the models can also be easily "misled" by semantics. The experiments also demonstrate that steering the model's authorship analysis and decision-making is challenging, unlike what is reported in studies dealing with high-resource modern languages. Although LLMs can, under certain circumstances, beat the traditional baselines, obtaining a nuanced and truly explainable decision requires, at best, a lot of experimentation.

Paper Structure

This paper contains 17 sections, 1 figure, 7 tables.

Figures (1)

  • Figure 1: Cosine Similarity Correlation Heatmap by Model and Prompt