Making Language Models Robust Against Negation
MohammadHossein Rezaei, Eduardo Blanco
TL;DR
This paper addresses the difficulty language models have with negation. It introduces two self-supervised pre-training tasks, Next Sentence Polarity Prediction (NSPP) and a polarity-reversed variant of Next Sentence Prediction (NSP), that improve reasoning about negation without requiring labeled data. On CondaQA, several NLI/NLU datasets, and LAMA/LAMA-Neg, models further pre-trained on these tasks are consistently more robust to negation, with the NSP variant generally yielding larger gains than NSPP. Joint training on both tasks helps in some settings but not universally, which points to interactions between the tasks and model size. The approach improves negation reasoning while remaining competitive on inputs without negation, offering a scalable, language-agnostic route to more reliable NLP systems.
Abstract
Negation has been a long-standing challenge for language models. Previous studies have shown that they struggle with negation in many natural language understanding tasks. In this work, we propose a self-supervised method to make language models more robust against negation. We introduce a novel task, Next Sentence Polarity Prediction (NSPP), and a variation of the Next Sentence Prediction (NSP) task. We show that BERT and RoBERTa further pre-trained on our tasks outperform the off-the-shelf versions on nine negation-related benchmarks. Most notably, our pre-training tasks yield between 1.8% and 9.1% improvement on CondaQA, a large question-answering corpus requiring reasoning over negation.
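
To make the self-supervised idea concrete, the following is a minimal sketch of how NSPP-style training pairs could be derived from unlabeled text, assuming that "polarity" means whether the next sentence contains a negation cue. The cue list, the function names `has_negation` and `nspp_examples`, and the binary label encoding are all hypothetical choices for illustration; the abstract does not specify how the authors detect negation or build their examples.

```python
import re

# Hypothetical, deliberately small cue list (an assumption, not the authors' detector).
NEGATION_CUES = {"not", "no", "never", "nobody", "nothing", "none",
                 "neither", "nor", "without"}


def has_negation(sentence: str) -> bool:
    """Return True if the sentence contains a cue word or an n't contraction."""
    text = sentence.lower().replace("\u2019", "'")  # normalize curly apostrophes
    tokens = re.findall(r"[a-z]+(?:'t)?", text)
    return any(tok in NEGATION_CUES or tok.endswith("n't") for tok in tokens)


def nspp_examples(sentences):
    """Pair each sentence with a 0/1 label for the polarity of the sentence after it.

    Label 1 means the following sentence contains negation, 0 means it does not.
    The labels come from the text itself, so no manual annotation is needed.
    """
    return [
        (sentences[i], int(has_negation(sentences[i + 1])))
        for i in range(len(sentences) - 1)
    ]


if __name__ == "__main__":
    doc = [
        "The museum opens at nine.",
        "It doesn't open on Mondays.",
        "Tickets are sold online.",
    ]
    for context, label in nspp_examples(doc):
        print(label, context)
```

Each pair could then be fed to a sentence classifier on top of BERT or RoBERTa during further pre-training. A keyword heuristic like this one will miss affixal negation (for example "unhappy") and may mislabel some uses of "no", so a practical pipeline would likely need a more careful cue detector.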
