Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes
Juraj Vladika, Stephen Meisenbacher, Florian Matthes
TL;DR
This work reframes lexical substitution as producing contextually relevant substitutes rather than merely synonyms, and introduces ConCat, a simple method that concatenates the masked target sentence with the original sentence to infuse context before generation. Using RoBERTa-base and WordNet-based filtering, ConCat yields substitutions that better fit the surrounding context while preserving sentence semantics, as demonstrated on the LS07, CoInCo, and Swords benchmarks, with additional evaluation on downstream AG News tasks and a qualitative user survey. The authors also critique current LS benchmarks, showing inconsistencies and suggesting the inclusion of human judgments to assess contextual suitability. Overall, ConCat improves contextual relevance and semantic preservation, offering practical benefits for text rewriting tasks and downstream NLP applications. The authors provide public code to foster adoption and further study.
Abstract
Lexical Substitution is the task of replacing a single word in a sentence with a similar one. Ideally, the replacement is not merely synonymous with the target word but also fits well into its surrounding context while preserving the sentence's grammatical structure. Recent advances in Lexical Substitution have leveraged the masked token prediction task of Pre-trained Language Models to generate replacements for a given word in a sentence. Building on this technique, we introduce ConCat, a simple augmented approach that uses the original sentence to bolster the contextual information sent to the model. Compared to existing approaches, it proves very effective in guiding the model to make contextually relevant predictions for the target word. Our study includes a quantitative evaluation, measured via sentence similarity and task performance. In addition, we conduct a qualitative human analysis to validate that users prefer the substitutions proposed by our method over those of previous methods. Finally, we test our approach on the prevailing benchmark for Lexical Substitution, CoInCo, revealing potential pitfalls of the benchmark. These insights serve as the foundation for a critical discussion of how Lexical Substitution is evaluated.
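
To make the input construction concrete, the following is a minimal sketch of how a ConCat-style model input could be assembled from a tokenized sentence. It is an illustration, not the authors' released code: the function name `build_concat_input`, the `original_first` ordering flag, and the plain-space separator are assumptions introduced here. Only the mask token `<mask>` is taken from RoBERTa's vocabulary.

```python
from typing import List

MASK_TOKEN = "<mask>"  # RoBERTa's mask token


def build_concat_input(
    tokens: List[str],
    target_index: int,
    mask_token: str = MASK_TOKEN,
    original_first: bool = True,  # assumption: ordering is not fixed by the text above
    separator: str = " ",         # assumption: a plain space joins the two copies
) -> str:
    """Join the original sentence with a copy whose target word is masked."""
    if not 0 <= target_index < len(tokens):
        raise IndexError("target_index is outside the sentence")

    original = " ".join(tokens)
    # Replace only the target position with the mask token.
    masked = " ".join(
        mask_token if i == target_index else tok for i, tok in enumerate(tokens)
    )

    parts = [original, masked] if original_first else [masked, original]
    return separator.join(parts)


# Hypothetical example sentence; the target word is "bank" (index 1).
example = build_concat_input(["The", "bank", "raised", "its", "rates", "."], 1)
print(example)
```

The resulting string contains exactly one mask position, so it could be passed to a masked language model such as RoBERTa-base to obtain candidate substitutes, which would then still need filtering (the TL;DR mentions WordNet-based filtering). Those later steps are omitted here because they depend on details this summary does not specify.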
