Hallucinate Less by Thinking More: Aspect-Based Causal Abstention for Large Language Models
Vy Nguyen, Ziqi Xu, Jeffrey Chan, Estrid He, Feng Xia, Xiuzhen Zhang
TL;DR
ABCA addresses hallucinations in large language models by enabling pre-generation abstention through aspect-conditioned causal inference. It formalizes a two-stage framework: Stage 1 discovers interpretable aspects, and Stage 2 estimates aspect-conditioned causal effects using Augmented Inverse Probability Weighting (AIPW), obtaining P(A | do(Q), X) by marginalizing P(c | Q, X) P(A | c, Q, X) over c. An abstention policy based on Centroid Angular Deviation abstains when the aspect-level evidence conflicts (Type-1) or consistently indicates insufficient knowledge (Type-2), and otherwise aggregates the consistent evidence into an answer, improving reliability and interpretability. Experiments on TruthfulQA, KUQ, AVeriTeC, and AbstainQA show state-of-the-art abstention performance across backbones, with ABCA balancing answering accuracy and abstention quality. This work demonstrates that leveraging internal knowledge diversity through causal reasoning can robustly mitigate hallucinations in practical LLM deployments.
Abstract
Large Language Models (LLMs) often produce fluent but factually incorrect responses, a phenomenon known as hallucination. Abstention, where the model chooses not to answer and instead outputs phrases such as "I don't know", is a common safeguard. However, existing abstention methods typically rely on post-generation signals, such as generation variations or feedback, which limits their ability to prevent unreliable responses in advance. In this paper, we introduce Aspect-Based Causal Abstention (ABCA), a new framework that enables early abstention by analysing the internal diversity of LLM knowledge through causal inference. This diversity reflects the multifaceted nature of parametric knowledge acquired from various sources, representing diverse aspects such as disciplines, legal contexts, or temporal frames. ABCA estimates causal effects conditioned on these aspects to assess the reliability of knowledge relevant to a given query. Based on these estimates, we enable two types of abstention: Type-1, where aspect effects are inconsistent (knowledge conflict), and Type-2, where aspect effects consistently support abstention (knowledge insufficiency). Experiments on standard benchmarks demonstrate that ABCA improves abstention reliability, achieves state-of-the-art performance, and enhances the interpretability of abstention decisions.
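The two abstention types can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `AspectEvidence` fields, the effect-weighted centroid, the `conflict_threshold` value, and the majority rule for Type-2 are all hypothetical choices made for illustration, and the abstract does not specify how the angular deviation is computed. The sketch assumes each aspect comes with a non-negative effect estimate and a non-zero vector representing its answer.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AspectEvidence:
    """Hypothetical container for one aspect's evidence (not from the paper)."""
    name: str
    effect: float                # assumed non-negative causal-effect estimate
    embedding: Sequence[float]   # assumed non-zero vector for the aspect's answer
    abstains: bool               # True if this aspect's answer is itself "I don't know"


def angle_between(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))


def decide(aspects: List[AspectEvidence], conflict_threshold: float = 0.5) -> str:
    """Return 'answer', 'abstain_type1' (conflict), or 'abstain_type2' (insufficiency)."""
    total = sum(a.effect for a in aspects)
    if total <= 0:
        # No aspect carries any weight: treat as insufficient knowledge.
        return "abstain_type2"

    dim = len(aspects[0].embedding)
    # Effect-weighted centroid of the aspect vectors.
    centroid = [
        sum(a.effect * a.embedding[i] for a in aspects) / total
        for i in range(dim)
    ]
    if all(abs(x) < 1e-12 for x in centroid):
        # Aspect vectors cancel out entirely: maximal disagreement.
        return "abstain_type1"

    # Effect-weighted mean angle between each aspect and the centroid.
    deviation = sum(
        a.effect * angle_between(a.embedding, centroid) for a in aspects
    ) / total
    if deviation > conflict_threshold:
        return "abstain_type1"

    # Aspects agree; abstain if the agreed-upon position is abstention.
    abstain_share = sum(a.effect for a in aspects if a.abstains) / total
    if abstain_share > 0.5:
        return "abstain_type2"
    return "answer"
```

For example, two aspects with orthogonal answer vectors and equal effects each deviate from the centroid by π/4 radians, which exceeds the assumed threshold of 0.5 and yields a Type-1 abstention. Aspects that point in nearly the same direction are aggregated, and the outcome then depends on whether the shared position is itself an abstention.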
