SANDWiCH: Semantical Analysis of Neighbours for Disambiguating Words in Context ad Hoc

Daniel Guzman-Olivares, Lara Quijano-Sanchez, Federico Liberatore

TL;DR

Word Sense Disambiguation remains challenging for out-of-distribution and low-resource contexts. SANDWiCH reframes WSD as cluster discrimination over a BabelNet-refined semantic graph and employs POS-aware cross-encoders to form and score sense clusters. It achieves state-of-the-art results on English all-words benchmarks and nine languages while using roughly 28% of the parameters of prior methods, demonstrating strong generalization and robustness, especially for rare senses. This approach offers a scalable, efficient path toward robust multilingual WSD with practical impact on downstream NLP tasks.

Abstract

The rise of generative chat-based Large Language Models (LLMs) over the past two years has spurred a race to develop systems that promise near-human conversational and reasoning experiences. However, recent studies indicate that the language understanding offered by these models remains limited and far from human-like performance, particularly in grasping the contextual meanings of words, an essential aspect of reasoning. In this paper, we present a simple yet computationally efficient framework for multilingual Word Sense Disambiguation (WSD). Our approach reframes the WSD task as a cluster discrimination analysis over a semantic network refined from BabelNet using group algebra. We validate our methodology across multiple WSD benchmarks, achieving a new state of the art for all languages and tasks, as well as in individual assessments by part of speech. Notably, our model significantly surpasses the performance of current alternatives, even in low-resource languages, while reducing the parameter count by 72%.

Paper Structure

This paper contains 19 sections, 4 equations, 2 figures, and 7 tables.

Figures (2)

  • Figure 1: Illustration of the SANDWiCH architecture processing the word "bank" in context.
  • Figure 2: Difference in error rate between ConSec (in salmon) and SANDWiCH (in light blue) for words with different numbers of glosses.