Neutralizing Bias in LLM Reasoning using Entailment Graphs

Liang Cheng; Tianyi Li; Zhaowei Wang; Tianyang Liu; Mark Steedman

TL;DR

This work tackles attestation bias in LLM-based Natural Language Inference (NLI), where models over-rely on memorized hypotheses rather than premises. It introduces an unsupervised pipeline that builds Entailment Graphs (EGs) from open-domain corpora, instantiates these graphs with typed entities to generate counterfactual NLI data, and fine-tunes LLMs using LoRA on this data. The approach yields significant reductions in attestation bias across multiple models and improves inferential performance, with especially strong gains for smaller models and robust improvements on bias-neutralized test sets. The bias-neutralized evaluation framework further enables a fair assessment of true reasoning capability, advancing robust NLI reasoning in practical settings and offering a pathway to broader task generalization in future work.
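The instantiation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy edges, the type names, the entity lists, and the `instantiate` helper are hypothetical and are not taken from the paper's pipeline. It only shows the general idea of filling a typed entailment edge with randomly chosen entities of the right types, so that the resulting premise/hypothesis pair is unlikely to be something the model has memorized.

```python
import random

# Hypothetical toy entailment graph: each edge says "premise predicate entails
# hypothesis predicate" for arguments of the given types (type of x, type of y).
EG_EDGES = [
    ("{x} bought {y}", "{x} owns {y}", ("person", "company")),
    ("{x} defeated {y}", "{x} played against {y}", ("team", "team")),
]

# Hypothetical typed entity vocabulary used to instantiate the edges.
ENTITIES = {
    "person": ["Alice Moreno", "Kenji Watts"],
    "company": ["Norvale Ltd", "Quintas Corp"],
    "team": ["the Harriers", "the Kestrels", "the Otters"],
}


def instantiate(edge, rng):
    """Fill one typed edge with random entities of matching types."""
    premise_t, hypothesis_t, (type_x, type_y) = edge
    x = rng.choice(ENTITIES[type_x])
    # Avoid using the same entity for both argument slots.
    y = rng.choice([e for e in ENTITIES[type_y] if e != x])
    return {
        "premise": premise_t.format(x=x, y=y),
        "hypothesis": hypothesis_t.format(x=x, y=y),
        "label": "entailment",  # the edge direction supplies the label
    }


if __name__ == "__main__":
    rng = random.Random(0)
    for edge in EG_EDGES:
        print(instantiate(edge, rng))
```

The sketch covers only positive (entailment) pairs; how non-entailment examples are produced and how the data is formatted for LoRA fine-tuning is not specified in this summary and is left out.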

Abstract

LLMs are often claimed to be capable of Natural Language Inference (NLI), which is widely regarded as a cornerstone of more complex forms of reasoning. However, recent work shows that LLMs still hallucinate in NLI because of attestation bias: they over-rely on propositional memory and use it as a shortcut instead of reasoning from the premise. To address this issue, we design an unsupervised framework that constructs counterfactual reasoning data and fine-tunes LLMs on it to reduce attestation bias. To measure bias reduction, we build bias-adversarial variants of NLI datasets in which the predicates in premises are randomly replaced while the hypotheses are kept unchanged. Evaluations on these variants show that our framework significantly reduces hallucinations caused by attestation bias. We further evaluate LLMs fine-tuned with our framework on the original NLI datasets and on their bias-neutralized versions, in which the original entities are replaced with randomly sampled ones. The results show that our framework consistently improves inferential performance on both the original and the bias-neutralized NLI datasets.
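The two evaluation variants described in the abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the example format (predicate templates plus an `x`/`y` entity pair), the predicate and entity pools, and the function names are all hypothetical.

```python
import random

# Hypothetical pools; a real setup would draw these from the dataset or a corpus.
PREDICATES = ["{x} bought {y}", "{x} visited {y}", "{x} criticized {y}"]
ENTITIES = ["Alice Moreno", "Norvale Ltd", "Quintas Corp", "Kenji Watts"]


def bias_adversarial(example, predicate_pool, rng):
    """Swap the premise predicate for a random different one.

    The hypothesis is left untouched, so a model that still answers as before
    is plausibly relying on the hypothesis rather than on the premise.
    """
    candidates = [p for p in predicate_pool if p != example["premise_pred"]]
    new = dict(example)
    new["premise_pred"] = rng.choice(candidates)
    return new


def bias_neutralized(example, entity_pool, rng):
    """Replace the original entities with two distinct randomly sampled ones."""
    x, y = rng.sample(entity_pool, 2)
    new = dict(example)
    new["x"], new["y"] = x, y
    return new


def render(example):
    """Turn an example into a (premise, hypothesis) pair of sentences."""
    args = {"x": example["x"], "y": example["y"]}
    return (
        example["premise_pred"].format(**args),
        example["hypothesis_pred"].format(**args),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    original = {
        "premise_pred": "{x} bought {y}",
        "hypothesis_pred": "{x} owns {y}",
        "x": "Google",
        "y": "YouTube",
    }
    print(render(original))
    print(render(bias_adversarial(original, PREDICATES, rng)))
    print(render(bias_neutralized(original, ENTITIES, rng)))
```

The adversarial variant changes only the premise, and the neutralized variant changes only the entities, so each isolates a different source of memorized signal.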
