LLMs are Frequency Pattern Learners in Natural Language Inference

Liang Cheng, Zhaowei Wang, Mark Steedman

TL;DR

The paper investigates what LLMs learn during fine-tuning on NLI by revealing a pervasive frequency bias in predicate usage: hypotheses in positive entailments tend to contain higher-frequency predicates than premises. Through WordFreq-based predicate frequency analysis and a bias metric, the authors show that both standard and NLI-tuned LLMs exploit this bias during inference, and that fine-tuning amplifies reliance on such patterns. They partition test examples into bias-consistent and bias-adversarial subsets, demonstrating significant performance gaps in adversarial cases, which highlights a robustness vulnerability. A WordNet-based hyponym–hypernym analysis further links frequency bias to a generalization gradient from specific to general concepts, offering a mechanistic explanation for why frequency patterns can boost inferential capability while compromising robustness.

Abstract

While fine-tuning LLMs on NLI corpora improves their inferential performance, the underlying mechanisms driving this improvement remain largely opaque. In this work, we conduct a series of experiments to investigate what LLMs actually learn during fine-tuning. We begin by analyzing predicate frequencies in premises and hypotheses across NLI datasets and identify a consistent frequency bias, where predicates in hypotheses occur more frequently than those in premises for positive instances. To assess the impact of this bias, we evaluate both standard and NLI fine-tuned LLMs on bias-consistent and bias-adversarial cases. We find that LLMs exploit frequency bias for inference and perform poorly on adversarial instances. Furthermore, fine-tuned LLMs exhibit significantly increased reliance on this bias, suggesting that they are learning these frequency patterns from datasets. Finally, we compute the frequencies of hyponyms and their corresponding hypernyms from WordNet, revealing a correlation between frequency bias and textual entailment. These findings help explain why learning frequency patterns can enhance model performance on inference tasks.
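The split into bias-consistent and bias-adversarial cases can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' code: the `Sample` type, the toy frequency table `TOY_FREQ` (made-up Zipf-style values standing in for a real frequency lookup), and the partition rule itself. The rule assumed is that a sample is bias-consistent when "hypothesis predicate is more frequent than premise predicate" agrees with the entailment label, and bias-adversarial otherwise. The `bias_rate` helper is a simple stand-in, not the paper's bias metric.

```python
from dataclasses import dataclass

# Hypothetical Zipf-style frequencies, invented for this example only.
TOY_FREQ = {"purchase": 4.6, "buy": 5.3, "devour": 3.1, "eat": 5.2}


@dataclass(frozen=True)
class Sample:
    premise_pred: str      # predicate of the premise
    hypothesis_pred: str   # predicate of the hypothesis
    entails: bool          # True for a positive (entailment) instance


def hypothesis_more_frequent(sample, freq):
    """True if the hypothesis predicate is strictly more frequent (ties count as False)."""
    return freq[sample.hypothesis_pred] > freq[sample.premise_pred]


def is_bias_consistent(sample, freq):
    """Assumed rule: the frequency direction agrees with the gold label."""
    return hypothesis_more_frequent(sample, freq) == sample.entails


def partition(samples, freq):
    """Split samples into (bias-consistent, bias-adversarial) lists."""
    cons = [s for s in samples if is_bias_consistent(s, freq)]
    adv = [s for s in samples if not is_bias_consistent(s, freq)]
    return cons, adv


def bias_rate(samples, freq):
    """Fraction of positive samples whose hypothesis predicate is more frequent."""
    positives = [s for s in samples if s.entails]
    if not positives:
        return 0.0
    hits = sum(hypothesis_more_frequent(s, freq) for s in positives)
    return hits / len(positives)


SAMPLES = [
    Sample("devour", "eat", True),    # rarer premise, entailment
    Sample("eat", "devour", False),   # rarer hypothesis, non-entailment
    Sample("buy", "purchase", True),  # rarer hypothesis, yet entailment
    Sample("devour", "buy", False),   # rarer premise, yet non-entailment
]

cons, adv = partition(SAMPLES, TOY_FREQ)
```

A model that relies on the frequency pattern would be expected to do well on `cons` and poorly on `adv`, which is the gap the abstract describes. In practice the toy table would be replaced by a corpus-derived frequency source.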

Paper Structure

This paper contains 27 sections, 1 equation, 1 figure, 9 tables.

Figures (1)

  • Figure 1: A sample in Levy/Holt$_{cons}$ and Levy/Holt$_{adv}$.