A Semantic Distance Metric Learning approach for Lexical Semantic Change Detection

Taichi Aida; Danushka Bollegala

TL;DR

This work tackles Lexical Semantic Change Detection (SCD) by proposing SDML, a supervised two-stage framework that first learns sense-aware word representations from WiC data and then learns a sense-aware distance metric to compare those representations across time. The sense-aware encoder captures meaning at the sense level, while the Mahalanobis-distance-based metric, learned via Information-Theoretic Metric Learning, quantifies semantic drift between time-separated corpora. Empirical results show SDML achieves state-of-the-art performance on multiple SCD benchmarks across several languages, with 2–5% gains over strong baselines, and reveals the existence of SCD-aware dimensions in sense embeddings. The study also analyzes how the learned metric exploits these dimensions and discusses limitations due to resource gaps in some languages, pointing to cross-lingual transfer as a promising direction for broader applicability.

Abstract

Detecting temporal semantic changes of words is an important task for various NLP applications that must make time-sensitive predictions. The Lexical Semantic Change Detection (SCD) task involves predicting whether a given target word, $w$, changes its meaning between two different text corpora, $C_1$ and $C_2$. For this purpose, we propose a supervised two-stage SCD method that uses existing Word-in-Context (WiC) datasets. In the first stage, for a target word $w$, we learn two sense-aware encoders that represent the meaning of $w$ in a given sentence selected from a corpus. In the second stage, we learn a sense-aware distance metric that compares the semantic representations of a target word across all of its occurrences in $C_1$ and $C_2$. Experimental results on multiple benchmark datasets for SCD show that our proposed method achieves strong performance in multiple languages. Additionally, our method achieves significant improvements on WiC benchmarks compared to a sense-aware encoder with conventional distance functions. Source code is available at https://github.com/LivNLP/svp-sdml .
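
The second stage can be illustrated with a minimal sketch. It assumes the learned metric takes the squared Mahalanobis form $(x - y)^\top A (x - y)$ used in Information-Theoretic Metric Learning, and that the change score for a word is the average of this distance over all pairs of occurrences from $C_1$ and $C_2$. The function names, the averaging scheme, and the toy embeddings are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y) for a PSD matrix A."""
    diff = x - y
    return float(diff @ A @ diff)

def semantic_change_score(emb_c1, emb_c2, A):
    """Average pairwise distance between occurrences of one word in C1 and C2.

    emb_c1: (n1, d) sense-aware embeddings of the word in C1 (hypothetical input)
    emb_c2: (n2, d) sense-aware embeddings of the word in C2 (hypothetical input)
    A:      (d, d) learned positive semi-definite matrix
    Averaging over all pairs is an assumption made for this sketch.
    """
    # diffs[i, j] = emb_c1[i] - emb_c2[j], shape (n1, n2, d)
    diffs = emb_c1[:, None, :] - emb_c2[None, :, :]
    # Quadratic form evaluated for every (i, j) pair at once
    dists = np.einsum("ijk,kl,ijl->ij", diffs, A, diffs)
    return float(dists.mean())

# Toy usage: with A = I the metric reduces to squared Euclidean distance.
c1 = np.array([[0.0, 0.0], [1.0, 0.0]])
c2 = np.array([[0.0, 1.0]])
score_identity = semantic_change_score(c1, c2, np.eye(2))
# A diagonal A that ignores the second dimension entirely.
score_weighted = semantic_change_score(c1, c2, np.diag([2.0, 0.0]))
```

A diagonal `A` reweights individual embedding dimensions, while off-diagonal entries let the metric account for interactions between dimensions.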

Paper Structure (19 sections, 10 equations, 8 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Semantic Distance Metric Learning
Learning Sense-Aware Encoder
Learning Sense-Aware Distance Metrics
Measuring Temporal Semantic Change
Experiments
Setting
Evaluating Semantic Changes of Words
Baselines:
Sense-Aware Methods:
Discussion
SCD-Aware Dimensions Exist in the Sense-Aware Embedding Space
Elements of the Sense-Aware Distance Metric
Conclusion
...and 4 more sections

Figures (8)

Figure 1: The existence of SCD-aware dimensions (left five columns) and SCD-unaware dimensions (right five columns) in the word representations. On the $y$-axis, the target words are sorted in descending order by the human-annotated semantic change scores (i.e. target words at the top have their meaning changed over time). For each dimension, the average pairwise (absolute) distance between the two timestamps is calculated. After that, the dimensions are sorted by the absolute value of the Spearman's rank correlation between these distances and the human-annotated scores. Finally, we plot the pairwise distances of the target words for the five dimensions with the highest correlations and the five with the lowest. (More details are in the section "SCD-Aware Dimensions Exist in the Sense-Aware Embedding Space".)
Figure 2: The existence of SCD-aware dimensions in the SemEval De. The top five correlations are 0.874, 0.872, 0.868, 0.858, and 0.848, and the bottom five correlations are 0.413, 0.411, 0.396, 0.381, and 0.374.
Figure 3: The common number of dimensions between each corresponding pair of classes for multiple dimensions. Dimensions are sorted by the order of SCD awareness ($x$-axis, Top-25% to Bottom-25%) and compared with the learned Mahalanobis matrices sorted by the importance for each SCD dataset ($y$-axis).
Figure 4: The existence of SCD-aware dimensions (left-five columns) and SCD-unaware dimensions (right-five columns) in the SemEval Sv. The top five correlations are 0.735, 0.726, 0.720, 0.701, and 0.700, and the bottom five correlations are 0.144, 0.129, 0.116, 0.081, and 0.052.
Figure 5: The existence of SCD-aware dimensions (left-five columns) and SCD-unaware dimensions (right-five columns) in the SemEval La. The top five correlations are 0.471, 0.449, 0.448, 0.442, and 0.419, and the bottom five correlations are $-$0.001, $-$0.001, 0.000, 0.000, and 0.000.
...and 3 more figures
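
The procedure described in the Figure 1 caption can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the Spearman correlation is computed with a simple rank transform that does not average tied ranks, and the input arrays are toy values.

```python
import numpy as np

def ordinal_ranks(values):
    """Rank transform without tie averaging (a simplifying assumption)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(a, b):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ordinal_ranks(a), ordinal_ranks(b))[0, 1])

def per_dimension_distance(emb_c1, emb_c2):
    """Average pairwise absolute difference per dimension, shape (d,)."""
    return np.abs(emb_c1[:, None, :] - emb_c2[None, :, :]).mean(axis=(0, 1))

def rank_dimensions(dist_matrix, gold_scores):
    """Sort dimensions by |Spearman correlation| with the gold change scores.

    dist_matrix: (num_words, d), one per-dimension distance vector per target word
    gold_scores: (num_words,), human-annotated semantic change scores
    """
    corrs = np.array([
        spearman(dist_matrix[:, k], gold_scores)
        for k in range(dist_matrix.shape[1])
    ])
    order = np.argsort(-np.abs(corrs))  # most SCD-aware dimension first
    return order, corrs

# Toy usage: dimension 1 tracks the gold scores exactly, dimension 0 does not.
gold = np.array([0.1, 0.5, 0.9, 0.3])
dist_matrix = np.stack([np.array([0.4, 0.1, 0.3, 0.2]), 2.0 * gold], axis=1)
order, corrs = rank_dimensions(dist_matrix, gold)
toy_dist = per_dimension_distance(
    np.array([[0.0, 0.0], [2.0, 0.0]]), np.array([[1.0, 0.0]])
)
```

In this toy input, `order` places dimension 1 first because its distances rise and fall with the gold scores, which is the property the caption calls SCD-aware.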
