Analyzing Semantic Change through Lexical Replacements

Francesco Periti; Pierluigi Cassotti; Haim Dubossarsky; Nina Tahmasebi

TL;DR

This work addresses how semantic change disrupts contextualization in modern language models by introducing a replacement schema that substitutes a target word with lexical replacements of varying relatedness while keeping the context fixed. Using WordNet-based replacements spanning synonyms, antonyms, hypernyms, and random choices, the authors quantify the resulting tension via self-embedding distance ($SED$) across multiple Transformer models and establish an interpretable framework for detecting semantic change. They demonstrate that tension and change signals vary by part of speech and semantic relation, and they develop both synthetic and replacement-based evaluation pipelines whose results are competitive with or superior to state-of-the-art methods; notably, they show that LLaMa 2 can outperform conventional LSC setups in substitution tasks. The findings advance interpretable semantic-change detection and offer practical insights for diachronic linguistic analysis and cross-lingual, model-based investigations, with implications for robust diachronic NLP and historical corpora research.
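To make the idea of tension concrete, the following is a minimal sketch of how a distance of this kind could be computed. It assumes, since this summary does not state the exact definition, that $SED$ is a cosine distance between the embedding at the target position before and after the replacement. The function names `cosine_distance` and `mean_sed` and the toy two-dimensional vectors are hypothetical; in practice the vectors would come from a Transformer layer.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def mean_sed(pairs):
    """Average the distance over (original, replaced) embedding pairs.

    Assumption: each pair holds the embedding at the target position
    in the original sentence and in the sentence after replacement.
    """
    if not pairs:
        raise ValueError("need at least one embedding pair")
    return sum(cosine_distance(o, r) for o, r in pairs) / len(pairs)


# Toy embeddings (hypothetical), grouped by the relation of the replacement.
by_relation = {
    "synonym": [([1.0, 0.0], [1.0, 0.0])],  # embedding unchanged
    "random": [([1.0, 0.0], [0.0, 1.0])],   # embedding orthogonal
}
scores = {rel: mean_sed(pairs) for rel, pairs in by_relation.items()}
```

Under this assumption, a replacement that leaves the embedding unchanged yields a distance of zero, while a more disruptive replacement yields a larger one, which is the kind of contrast the replacement classes are meant to expose.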

Abstract

Modern language models are capable of contextualizing words based on their surrounding context. However, this capability is often compromised due to semantic change that leads to words being used in new, unexpected contexts not encountered during pre-training. In this paper, we model semantic change by studying the effect of unexpected contexts introduced by lexical replacements. We propose a replacement schema where a target word is substituted with lexical replacements of varying relatedness, thus simulating different kinds of semantic change. Furthermore, we leverage the replacement schema as a basis for a novel interpretable model for semantic change. We are also the first to evaluate the use of LLaMa for semantic change detection.

Paper Structure (21 sections, 3 equations, 5 figures, 6 tables)

Introduction
Our contributions:
Related Work
Modern contextualized LMs
Lexical Semantic Change (LSC)
Methodology
The replacement schema
Data
Experimental setup
Tension caused by semantic change
Self-embedding distance
Semantic change
LSC through synthetic dataset
LSC through replacements
Random replacements
...and 6 more sections

Figures (5)

Figure 1: Average SED over layers.
Figure 2: Spearman Correlation over layers for artificial semantic change.
Figure 3: Top-k replacement vs Spearman Correlation.
Figure 4: PRT and JSD performance on the artificial LSC dataset
Figure 5: PRT and JSD performance on the artificial LSC dataset
