The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change
Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde
TL;DR
DURel is an online, open-source tool that combines human and computational annotation to measure semantic proximity between word uses and to infer word senses via Word Usage Graphs (WUGs) and correlation clustering. The system integrates computational annotators based on Word-in-Context (WiC) models, standardized training, and visualization to support lexicography and the analysis of diachronic sense change. Case studies on an arm-related dataset and Swedish headwords demonstrate the approach's practical utility and its ability to flag outdated or novel senses. Overall, DURel provides an end-to-end workflow for sense clustering and semantic variation analysis, with an emphasis on inter-annotator agreement and scalable annotation of large corpora.
Abstract
We present the DURel tool, which implements the annotation of semantic proximity between word uses in an online, open-source interface. The tool supports standardized human annotation as well as computational annotation, building on recent advances with Word-in-Context models. Annotator judgments are clustered with automatic graph clustering techniques and visualized for analysis. This makes it possible to measure word senses with simple and intuitive micro-task judgments between use pairs, requiring minimal preparation effort. The tool also offers functionality to compare the agreement between annotators, in order to ensure the inter-subjectivity of the obtained judgments, and to calculate summary statistics that give insights into sense frequency distributions, semantic variation, or changes of senses over time.
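The path from pairwise judgments to sense clusters can be illustrated with a minimal sketch. It is not the tool's implementation. It assumes judgments on a 1 to 4 proximity scale, aggregates repeated judgments per use pair by their median, and treats 2.5 as the boundary between "same sense" and "different sense" evidence. The clustering step is a simple greedy local search on the correlation clustering objective, swapped in for whatever optimizer the tool actually uses. All function names, the threshold, and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def build_wug(judgments):
    """Aggregate (use_a, use_b, score) judgments into median edge weights."""
    scores = defaultdict(list)
    for a, b, score in judgments:
        if a == b:
            continue  # a use is not compared with itself
        scores[frozenset((a, b))].append(score)
    return {pair: median(vals) for pair, vals in scores.items()}


def disagreement(edges, labels, threshold=2.5):
    """Correlation clustering cost: weight of edges that contradict the labels."""
    cost = 0.0
    for pair, weight in edges.items():
        a, b = tuple(pair)
        shifted = weight - threshold  # > 0: same-sense evidence, < 0: different
        same_cluster = labels[a] == labels[b]
        if shifted > 0 and not same_cluster:
            cost += shifted
        elif shifted < 0 and same_cluster:
            cost += -shifted
    return cost


def cluster(edges, threshold=2.5):
    """Greedy local search: move one use at a time while the cost decreases."""
    nodes = sorted({use for pair in edges for use in pair})
    labels = {use: i for i, use in enumerate(nodes)}  # start with singletons
    improved = True
    while improved:
        improved = False
        for use in nodes:
            best_label = labels[use]
            best_cost = disagreement(edges, labels, threshold)
            for candidate in sorted(set(labels.values())):
                trial = dict(labels)
                trial[use] = candidate
                cost = disagreement(edges, trial, threshold)
                if cost < best_cost:  # strict improvement guarantees termination
                    best_label, best_cost = candidate, cost
            if best_label != labels[use]:
                labels[use] = best_label
                improved = True
    return labels


# Toy data (invented): u1-u3 use "arm" as a body part, u4-u5 as a weapon.
judgments = [
    ("u1", "u2", 4), ("u1", "u2", 3), ("u1", "u3", 4), ("u2", "u3", 3),
    ("u4", "u5", 4),
    ("u1", "u4", 1), ("u2", "u5", 2), ("u3", "u4", 1), ("u3", "u5", 1),
]
edges = build_wug(judgments)
labels = cluster(edges)
print(labels)
```

On this toy input the search groups u1, u2 and u3 into one cluster and u4 and u5 into another. Greedy local search can stop in a local optimum on larger or noisier graphs, so a real analysis would typically use several restarts or a stronger optimizer.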
