Latin Treebanks in Review: An Evaluation of Morphological Tagging Across Time

Marisa Hudspeth; Brendan O'Connor; Laure Thompson

TL;DR

This study addresses cross-time and cross-genre morphological tagging for Latin by consolidating five UD treebanks with LASLA through automated harmonization and an automated conversion of morphological features to the conventions of standard Latin grammar. It adds time and genre metadata, constructs cross-time data splits, and shows that a LatinBERT-based tagger with separate feature heads outperforms existing taggers while being more robust to cross-domain shifts. Key contributions include an annotation agreement analysis between UD and LASLA, a standardized tagset aligned with traditional Latin grammar, and cross-time experiments that show where data harmonization helps or hurts. The work lays groundwork for more reliable Latin NLP across historical varieties and informs dataset curation, model design, and evaluation in digital humanities contexts.

Abstract

Existing Latin treebanks draw from Latin's long written tradition, spanning 17 centuries and a variety of cultures. Recent efforts have begun to harmonize these treebanks' annotations to better train and evaluate morphological taggers. However, the heterogeneity of these treebanks must be carefully considered to build effective and reliable data. In this work, we review existing Latin treebanks to identify the texts they draw from, identify their overlap, and document their coverage across time and genre. We additionally design automated conversions of their morphological feature annotations into the conventions of standard Latin grammar. From this, we build new time-period data splits drawn from the existing treebanks, which we use to perform a broad cross-time analysis of POS and morphological feature tagging. We find that BERT-based taggers outperform existing taggers while also being more robust to cross-domain shifts.
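
To make the idea of an automated feature conversion concrete, the following is a minimal sketch of rewriting a token's morphological features in the CoNLL-U `FEATS` format used by UD treebanks (`Name=Value` pairs joined by `|`, with `_` for none). The rule tables `COLLAPSE_VALUES` and `DROP_FEATURES`, and the function names, are hypothetical placeholders chosen for illustration; they are not the conversion rules used in the paper.

```python
# Illustrative only: the two rule tables below are hypothetical examples,
# NOT the harmonization rules described in the paper.
COLLAPSE_VALUES = {("Degree", "Abs"): "Sup"}  # assumed (feature, value) -> new value
DROP_FEATURES = {"InflClass"}                 # assumed features to discard


def parse_feats(feats: str) -> dict[str, str]:
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def format_feats(feats: dict[str, str]) -> str:
    """Serialize a feature dict back to CoNLL-U form, sorted by feature name."""
    if not feats:
        return "_"
    items = sorted(feats.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{name}={value}" for name, value in items)


def harmonize(feats: str) -> str:
    """Apply the (hypothetical) drop and collapse rules to one FEATS string."""
    out = {}
    for name, value in parse_feats(feats).items():
        if name in DROP_FEATURES:
            continue  # discard features outside the target tagset
        out[name] = COLLAPSE_VALUES.get((name, value), value)
    return format_feats(out)


if __name__ == "__main__":
    print(harmonize("Case=Nom|Degree=Abs|InflClass=IndEurA"))
```

Keeping the rules in plain lookup tables, separate from the parsing code, would make it straightforward to audit each conversion and to apply a different rule set per treebank.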

Paper Structure (34 sections, 4 figures, 13 tables)

Introduction
Latin Treebanks Revisited
Time and Genre Metadata
Genre.
Time.
UD Treebanks
LASLA: Additional Classical-era treebank
Harmonizing UD and LASLA Annotations
Annotation Agreement Between UD and LASLA
Our Harmonization Efforts
Remaining inconsistencies we've harmonized.
Collapsing feature values.
Conversion to Standard Latin Grammar
Remaining Inconsistencies
Our Custom Data Splits
...and 19 more sections

Figures (4)

Figure 1: From our curated metadata (see the data section), the number of sentences per century (3rd BCE to 14th CE) across the 5 UD treebanks and LASLA, shown with three broad time periods.
Figure 2: Number of sentences in the UD treebanks per century, colored by genre.
Figure 3: Example of how a token's set of morphological features changes after standardization, from Cicero’s Letters to Atticus Book 3 Letter 9.
Figure 4: Example of an error in the model's prediction due to acontextual ambiguity, from Cicero’s Letters to Atticus Book 3 Letter 9.
