Explicit Inductive Inference using Large Language Models
Tianyang Liu, Tianyi Li, Liang Cheng, Mark Steedman
TL;DR
This work addresses the attestation bias of Large Language Models (LLMs): when judging whether a premise P entails a hypothesis H, they often tie their prediction to H's out-of-context truth rather than to its conditional truth given P. The authors introduce Explicit Inductive Inference (EIDI), a pipeline that converts P into multiple attested alternatives P' by replacing its arguments with type-preserving entities, derives the corresponding hypotheses H', and aggregates the LLM's predictions on these derived inquiries into an explicit inductive score for the original query. Experiments on the Levy/Holt directional predicate entailment dataset with GPT-3.5 and Llama3 show that EIDI markedly improves overall inference performance and reduces sensitivity to attestation bias, especially when a larger set of transformed inferences is aggregated. EIDI relies solely on the LLM's own generation, without external knowledge, which makes it a practical approach to bias-aware reasoning in downstream tasks such as knowledge graph (KG) completion and question answering. The work also discusses limitations, including computational cost and interaction with frequency biases, and suggests directions for future refinements of bias-robust reasoning frameworks.
Abstract
Large Language Models (LLMs) are reported to exhibit an undesirable attestation bias on inference tasks: when asked to predict whether a premise P entails a hypothesis H, instead of considering whether H is true conditional on P, LLMs tend to use the out-of-context truth label of H as a fragile proxy. In this paper, we propose a pipeline that exploits this bias to perform explicit inductive inference. Our pipeline uses an LLM to transform a premise into a set of attested alternatives, and then aggregates the answers to the derived entailment inquiries to support the original inference prediction. On a directional predicate entailment benchmark, we demonstrate that applying this simple pipeline improves the overall performance of LLMs on inference and substantially alleviates the impact of their attestation bias.
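The transform-then-aggregate idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that premise and hypothesis are (subject, predicate, object) triples sharing the same arguments, that the attested alternative argument pairs have already been generated (in the paper an LLM produces them), and that the per-inquiry scores are combined with a plain mean. The names `substitute`, `derive_inquiries`, `eidi_score`, and `toy_score` are hypothetical, and `toy_score` is a stand-in for a real LLM entailment query.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def substitute(triple: Triple, mapping: Dict[str, str]) -> Triple:
    """Replace the arguments of a triple, leaving the predicate unchanged."""
    subj, pred, obj = triple
    return (mapping.get(subj, subj), pred, mapping.get(obj, obj))


def derive_inquiries(
    premise: Triple,
    hypothesis: Triple,
    alternatives: Iterable[Tuple[str, str]],
) -> List[Tuple[Triple, Triple]]:
    """Build one (P', H') pair per alternative argument pair.

    The same substitution is applied to premise and hypothesis, so the
    pairing still works when the hypothesis lists the arguments in
    swapped order.
    """
    p_subj, _, p_obj = premise
    pairs = []
    for new_subj, new_obj in alternatives:
        mapping = {p_subj: new_subj, p_obj: new_obj}
        pairs.append((substitute(premise, mapping), substitute(hypothesis, mapping)))
    return pairs


def eidi_score(
    premise: Triple,
    hypothesis: Triple,
    alternatives: Iterable[Tuple[str, str]],
    score_fn: Callable[[Triple, Triple], float],
) -> float:
    """Aggregate entailment scores over the derived inquiries.

    Assumption: aggregation is an unweighted mean. With no alternatives,
    fall back to scoring the original pair directly.
    """
    inquiries = derive_inquiries(premise, hypothesis, alternatives)
    if not inquiries:
        return score_fn(premise, hypothesis)
    return mean(score_fn(p, h) for p, h in inquiries)


def toy_score(premise: Triple, hypothesis: Triple) -> float:
    """Hypothetical stand-in for an LLM call returning an entailment score."""
    return 1.0 if (premise[1], hypothesis[1]) == ("bought", "owns") else 0.0


premise = ("Google", "bought", "YouTube")
hypothesis = ("Google", "owns", "YouTube")
alternatives = [("Facebook", "Instagram"), ("Microsoft", "LinkedIn")]
score = eidi_score(premise, hypothesis, alternatives, toy_score)
```

Because the scorer is passed in as a function, the same aggregation can wrap any model. In a real setting `score_fn` would prompt the LLM with each derived premise and hypothesis and return its probability of entailment.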
