XL-DURel: Finetuning Sentence Transformers for Ordinal Word-in-Context Classification
Sachin Yadav, Dominik Schlechtweg
TL;DR
Word-in-Context (WiC) tasks span binary, ordinal, and graded similarity formulations. XL-DURel finetunes a multilingual Sentence Transformer with ranking-oriented losses (AnglE, CoSENT) and target-word marking to exploit the ordinal structure of OGWiC data, training on a merged mix of ordinal CoMeDi and binary WiC datasets. Ranking losses, especially AnglE, achieve top performance on ordinal evaluation and competitive results on binary WiC, and mixing ordinal and binary training data boosts performance further. The work presents a unified, efficient framework for multilingual WiC modeling and provides a practical embedder for word meaning in context, with implications for downstream language understanding tasks.
Abstract
We propose XL-DURel, a finetuned, multilingual Sentence Transformer model optimized for ordinal Word-in-Context (WiC) classification. We test several loss functions for regression and ranking tasks, and outperform previous models on both ordinal and binary data with a ranking objective based on angular distance in complex space. We further show that binary WiC can be treated as a special case of ordinal WiC, and that optimizing models for the general ordinal task improves performance on the more specific binary task. This paves the way for a unified treatment of WiC modeling across different task formulations.
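To make the idea of a ranking objective concrete, the following is a minimal sketch of a CoSENT-style pairwise ranking loss in plain Python. It is an illustration under stated assumptions, not the training code used for XL-DURel: the function names, the toy scores, and the scale value of 20 are choices made for this example. The loss only looks at the order of the gold labels, so the same function accepts ordinal judgments (for example 1 to 4) and binary ones (0 or 1), which is one way to see binary WiC as a special case of ordinal WiC. An AnglE-style objective would keep this structure but replace the cosine score with an angle difference computed in complex space, which is not implemented here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranking_loss(scores, labels, scale=20.0):
    """CoSENT-style loss: log(1 + sum(exp(scale * (s_low - s_high)))).

    The sum runs over all pairs of usage pairs whose gold labels differ,
    where s_high is the predicted score of the higher-labeled item and
    s_low the score of the lower-labeled one. `scale` is an assumed value.
    """
    terms = [
        scale * (scores[k] - scores[i])
        for i in range(len(scores))
        for k in range(len(scores))
        if labels[i] > labels[k]
    ]
    # Numerically stable log(1 + sum(exp(t))) via the log-sum-exp trick.
    m = max([0.0] + terms)
    return m + math.log(math.exp(-m) + sum(math.exp(t - m) for t in terms))


# Toy example (hypothetical numbers): three usage pairs with ordinal labels.
ordinal_labels = [4, 3, 1]
well_ordered = [0.9, 0.6, 0.1]   # scores agree with the label order
mis_ordered = [0.1, 0.6, 0.9]    # scores contradict the label order

loss_good = ranking_loss(well_ordered, ordinal_labels)
loss_bad = ranking_loss(mis_ordered, ordinal_labels)

# The same function works unchanged with binary labels.
binary_loss_good = ranking_loss([0.9, 0.1], [1, 0])
binary_loss_bad = ranking_loss([0.1, 0.9], [1, 0])
```

In practice the scores would come from `cosine` applied to the embeddings of the two usages in each pair. Because the loss depends only on which label is larger, mixing ordinal and binary examples in one batch requires no change to the objective.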
