Improving the Natural Language Inference robustness to hard dataset by data augmentation and preprocessing

Zijiang Yang

TL;DR

This work tackles the robustness of natural language inference (NLI) models on hard, out-of-distribution data. It proposes three general methods (word overlap augmentation, numerical reasoning augmentation, and length-mismatch preprocessing via the Split algorithm) to push models beyond surface-level pattern memorization. Using ELECTRA-small trained on SNLI, the paper reports gains on hard datasets such as HANS (≈14%) and ANLI (≈3–7%, depending on configuration), with only a modest reduction on SNLI (≈3%) and with as few as 1000 augmented samples. The findings suggest that lightweight, distribution-agnostic augmentation can substantially improve out-of-distribution generalization for NLI with modest data overhead.

Abstract

Natural Language Inference (NLI) is the task of determining whether a hypothesis can be justified by a given premise. Concretely, the hypothesis is classified into one of three labels (entailment, neutral, or contradiction) given the premise. NLI has been well studied, and many models, especially transformer-based ones, have achieved significant improvements on this task. However, these models are reported to struggle on hard datasets. In particular, they perform much worse on unseen, out-of-distribution premise-hypothesis pairs, which suggests that they may learn spurious correlations rather than understand the semantic content. In this work, we propose data augmentation and preprocessing methods that address the word overlap, numerical reasoning, and length mismatch problems. These methods are general, do not rely on the distribution of the test data, and help improve the robustness of the models.
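
To make the word overlap problem concrete, the following minimal Python sketch measures how much of a hypothesis is lexically covered by its premise. It is an illustration only, not the paper's augmentation method: the function name `overlap_ratio`, the whitespace tokenization, and the example sentence pair are all assumptions introduced here.

```python
def overlap_ratio(premise: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also occur in the premise.

    Uses naive lowercase whitespace tokenization (an assumption made
    for brevity; a real pipeline would use the model's tokenizer).
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    shared = sum(1 for tok in hypothesis_tokens if tok in premise_tokens)
    return shared / len(hypothesis_tokens)


# Hypothetical HANS-style pair: every hypothesis word appears in the
# premise, yet swapping subject and object means the premise does not
# entail the hypothesis.
premise = "the doctor visited the lawyer"
hypothesis = "the lawyer visited the doctor"
ratio = overlap_ratio(premise, hypothesis)  # full lexical overlap
```

A model that has learned "high overlap implies entailment" as a shortcut would tend to mislabel pairs like this one, which is the kind of spurious correlation the abstract describes.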

Paper Structure

This paper contains 11 sections, 6 figures, 1 table, 1 algorithm.
