Analyzing Semantic Change through Lexical Replacements
Francesco Periti, Pierluigi Cassotti, Haim Dubossarsky, Nina Tahmasebi
TL;DR
This work studies how semantic change undermines the contextualization ability of language models. It introduces a replacement schema that substitutes a target word with lexical replacements of varying relatedness while keeping the context fixed. Using WordNet-based replacements (synonyms, antonyms, hypernyms, and random words), the authors quantify the resulting tension via self-embedding distance ($SED$) across multiple Transformer models and build an interpretable framework for detecting semantic change on top of the schema. They show that tension and change signals vary by part of speech and semantic relation, and they develop both synthetic and replacement-based evaluation pipelines whose results are competitive with or superior to state-of-the-art methods; notably, LLaMa 2 can outperform conventional lexical semantic change (LSC) setups in substitution tasks. The findings advance interpretable semantic change detection and offer practical insights for diachronic linguistic analysis of historical corpora and for cross-lingual, model-based investigations.
Abstract
Modern language models are capable of contextualizing words based on their surrounding context. However, this capability is often compromised due to semantic change that leads to words being used in new, unexpected contexts not encountered during pre-training. In this paper, we model \textit{semantic change} by studying the effect of unexpected contexts introduced by \textit{lexical replacements}. We propose a \textit{replacement schema} where a target word is substituted with lexical replacements of varying relatedness, thus simulating different kinds of semantic change. Furthermore, we leverage the replacement schema as a basis for a novel \textit{interpretable} model for semantic change. We are also the first to evaluate the use of LLaMa for semantic change detection.
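The replacement schema can be illustrated with a minimal sketch. It assumes that the self-embedding distance is the cosine distance between the embedding at the target position before and after the replacement; this section does not spell out that definition. The names `replace_target`, `self_embedding_distance`, `toy_embed`, and the toy vectors are hypothetical and exist only for illustration. In particular, `toy_embed` is a stand-in for a contextualized model and is not the authors' setup.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, float]

# Hypothetical 2-d "lexicon" used only to make the sketch runnable.
TOY_VECTORS: Dict[str, Vector] = {
    "happy": (1.0, 0.1),
    "glad": (0.9, 0.2),    # plays the role of a synonym replacement
    "stone": (0.0, 1.0),   # plays the role of a random replacement
}
DEFAULT_VECTOR: Vector = (0.5, 0.5)


def replace_target(tokens: Sequence[str], index: int, replacement: str) -> List[str]:
    """Return a copy of `tokens` with the token at `index` swapped out."""
    replaced = list(tokens)
    replaced[index] = replacement
    return replaced


def toy_embed(tokens: Sequence[str], index: int) -> Vector:
    """Toy contextual embedding: 80% the token's own vector, 20% the context mean."""
    own = TOY_VECTORS.get(tokens[index], DEFAULT_VECTOR)
    context = [TOY_VECTORS.get(t, DEFAULT_VECTOR)
               for i, t in enumerate(tokens) if i != index]
    mean = tuple(sum(v[d] for v in context) / len(context) for d in range(2))
    return tuple(0.8 * own[d] + 0.2 * mean[d] for d in range(2))


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, i.e. 1 minus cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def self_embedding_distance(
    tokens: Sequence[str],
    index: int,
    replacement: str,
    embed: Callable[[Sequence[str], int], Sequence[float]],
) -> float:
    """Distance between the target-position embedding before and after replacement."""
    original = embed(tokens, index)
    replaced = embed(replace_target(tokens, index, replacement), index)
    return cosine_distance(original, replaced)


sentence = ["she", "felt", "happy", "today"]
sed_synonym = self_embedding_distance(sentence, 2, "glad", toy_embed)
sed_random = self_embedding_distance(sentence, 2, "stone", toy_embed)
print(f"synonym: {sed_synonym:.3f}  random: {sed_random:.3f}")
```

In a realistic setting, `embed` would wrap a Transformer model and the replacements would be drawn from WordNet relations; the sketch only shows how a fixed context, a replacement, and a distance between embeddings fit together.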
