Rethinking Metrics for Lexical Semantic Change Detection

Roksana Goworek, Haim Dubossarsky

TL;DR

Lexical semantic change detection (LSCD) has relied heavily on global distribution metrics such as APD and PRT, which can overlook localised shifts in usage. The authors propose Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), two usage-level metrics based on local cross-temporal correspondence, and evaluate them across languages, encoders, and representation spaces. In these experiments, AMD and SAMD are often more robust than the traditional metrics, especially under dimensionality reduction, with AMD benefiting from a definition-based space and SAMD excelling with specialised encoders. The work argues for a complementary LSCD toolkit that combines local correspondence metrics with interpretable spaces to improve discovery, robustness, and interpretability in contextualised embedding-based change analysis.

Abstract

Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT). We introduce Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), new measures that quantify semantic change via local correspondence between word usages across time periods. Across multiple languages, encoder models, and representation spaces, we show that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. We suggest that LSCD may benefit from considering alternative semantic change metrics beyond APD and PRT, with AMD offering a robust option for contextualised embedding-based analysis.
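
The four metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are assumptions read off the metric names: AMD is taken to be the mean, over usages in one period, of the cosine distance to the nearest usage in the other period, and SAMD the mean of the two directions. The paper's exact definitions (direction, distance function, normalisation) may differ. All function names are illustrative.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def prototype(usages):
    """Component-wise mean of a list of usage embeddings."""
    return [mean(column) for column in zip(*usages)]


def apd(c1, c2):
    """Average Pairwise Distance: mean distance over all cross-period pairs."""
    return mean(cosine_distance(u, v) for u in c1 for v in c2)


def prt(c1, c2):
    """Prototype distance: cosine distance between the two mean embeddings."""
    return cosine_distance(prototype(c1), prototype(c2))


def amd(c1, c2):
    """ASSUMED definition: for each usage in c1, take the distance to its
    nearest usage in c2, then average. Not confirmed by the abstract."""
    return mean(min(cosine_distance(u, v) for v in c2) for u in c1)


def samd(c1, c2):
    """ASSUMED definition: average of AMD computed in both directions."""
    return 0.5 * (amd(c1, c2) + amd(c2, c1))


# Toy data: period 1 has one sense; period 2 keeps it and gains a new one.
c1 = [[1.0, 0.0], [1.0, 0.0]]
c2 = [[1.0, 0.0], [0.0, 1.0]]

print(apd(c1, c2))   # 0.5
print(amd(c1, c2))   # 0.0  (every old usage still has a close match)
print(amd(c2, c1))   # 0.5  (the new-sense usage has no close match)
print(samd(c1, c2))  # 0.25
```

In this toy case the one-directional score depends on which period is treated as the source, which is one plausible motivation for a symmetric variant; whether this matches the authors' reasoning is not stated in the abstract.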

Paper Structure

This paper contains 35 sections, 11 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Illustrative schematic contrasting LSCD distance measures on usage embeddings from two corpora (blue: $C_1$, yellow: $C_2$).
  • Figure 2: Performance (Spearman correlations) across metrics and spaces, averaged over languages and encoders. Standard deviations in brackets.
  • Figure 3: Performance of APD, PRT, AMD and SAMD using XL-LEXEME embeddings across representation spaces. Each row corresponds to a language; columns show metric–space combinations.
  • Figure 4: Performance of APD, PRT, AMD and SAMD for non-specialised encoders.
  • Figure 5: Stress test of metrics under progressive dimensionality reduction. Spearman correlations are averaged across languages and encoders as embedding dimensionality is reduced by factors of two using PCA or random dimension selection.
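
The stress test described in the Figure 5 caption can be sketched roughly as follows, for the random-dimension-selection variant only (the PCA variant is omitted). This is a sketch under stated assumptions, not the authors' procedure: the function names, the use of one shared random subset per dimensionality, and the independent resampling at each halving step are all hypothetical choices. `metric` stands for any change metric taking two lists of usage embeddings, and `gold` for the graded change scores of a benchmark.

```python
import random


def rank(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = rank(xs), rank(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    denom = (var_x * var_y) ** 0.5
    return cov / denom if denom > 0 else float("nan")


def halving_dims(dim):
    """Dimensionalities visited when halving repeatedly: dim, dim // 2, ..., 1."""
    dims = []
    while dim >= 1:
        dims.append(dim)
        dim //= 2
    return dims


def project(usages, kept):
    """Keep only the selected coordinates of every usage embedding."""
    return [[u[i] for i in kept] for u in usages]


def stress_test(words, gold, metric, dim, seed=0):
    """words: one (c1, c2) pair of usage-embedding lists per target word.
    Returns {kept dimensionality: Spearman(metric scores, gold scores)}.
    ASSUMPTION: the same random subset is applied to all words at each step."""
    rng = random.Random(seed)
    results = {}
    for keep in halving_dims(dim):
        kept = sorted(rng.sample(range(dim), keep))
        scores = [metric(project(c1, kept), project(c2, kept)) for c1, c2 in words]
        results[keep] = spearman(scores, gold)
    return results
```

Averaging the returned correlations over languages and encoders, as the caption describes, would be a separate outer loop over datasets and models.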