A Semantic Distance Metric Learning approach for Lexical Semantic Change Detection
Taichi Aida, Danushka Bollegala
TL;DR
This work tackles Lexical Semantic Change Detection (SCD) by proposing SDML, a supervised two-stage framework that first learns sense-aware word representations from WiC data and then learns a sense-aware distance metric to compare those representations across time. The sense-aware encoder captures meaning at the sense level, while the Mahalanobis-distance-based metric, learned via Information-Theoretic Metric Learning, quantifies semantic drift between time-separated corpora. Empirical results show SDML achieves state-of-the-art performance on multiple SCD benchmarks across several languages, with 2–5% gains over strong baselines, and reveals the existence of SCD-aware dimensions in sense embeddings. The study also analyzes how the learned metric exploits these dimensions and discusses limitations due to resource gaps in some languages, pointing to cross-lingual transfer as a promising direction for broader applicability.
Abstract
Detecting temporal semantic changes of words is an important task for various NLP applications that must make time-sensitive predictions. The Lexical Semantic Change Detection (SCD) task involves predicting whether a given target word, $w$, changes its meaning between two different text corpora, $C_1$ and $C_2$. For this purpose, we propose a supervised two-stage SCD method that uses existing Word-in-Context (WiC) datasets. In the first stage, for a target word $w$, we learn two sense-aware encoders that represent the meaning of $w$ in a given sentence selected from a corpus. In the second stage, we learn a sense-aware distance metric that compares the semantic representations of a target word across all of its occurrences in $C_1$ and $C_2$. Experimental results on multiple benchmark datasets for SCD show that our proposed method achieves strong performance in multiple languages. Additionally, our method achieves significant improvements on WiC benchmarks compared to a sense-aware encoder with conventional distance functions. Source code is available at https://github.com/LivNLP/svp-sdml.
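To make the second stage concrete, the following is a minimal sketch of how a Mahalanobis-style metric could compare a target word's representations across $C_1$ and $C_2$. It is an illustration under stated assumptions, not the authors' implementation: the function names `mahalanobis_sq` and `change_score`, the toy two-dimensional embeddings, and the choice to average over all cross-corpus pairs are hypothetical. The matrix `A` stands in for a learned positive semi-definite matrix (which the TL;DR says is obtained via Information-Theoretic Metric Learning); here it is simply fixed by hand.

```python
from itertools import product
from statistics import mean


def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y).

    A is assumed to be a symmetric positive semi-definite matrix,
    given as a list of rows. In this sketch it is hand-picked, not learned.
    """
    diff = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product A @ diff.
    a_diff = [sum(a * d for a, d in zip(row, diff)) for row in A]
    # Inner product diff^T (A @ diff).
    return sum(d * ad for d, ad in zip(diff, a_diff))


def change_score(emb_c1, emb_c2, A):
    """Average squared distance over all cross-corpus pairs of occurrences.

    emb_c1 / emb_c2 hold one embedding per occurrence of the target word
    in C_1 / C_2. Averaging over pairs is an assumption of this sketch.
    """
    return mean(mahalanobis_sq(x, y, A) for x, y in product(emb_c1, emb_c2))


# Toy occurrence embeddings (hypothetical values).
emb_c1 = [[0.0, 0.0]]
emb_c2 = [[3.0, 4.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]    # reduces to squared Euclidean distance
first_dim = [[1.0, 0.0], [0.0, 0.0]]   # only the first dimension contributes

print(change_score(emb_c1, emb_c2, identity))   # 25.0
print(change_score(emb_c1, emb_c2, first_dim))  # 9.0
```

With the identity matrix the score is the ordinary squared Euclidean distance; a matrix that weights some dimensions more than others changes which dimensions drive the score, which is the intuition behind a metric that exploits particular dimensions of the sense embeddings.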
