Follow My Lead: Logical Fallacy Classification with Knowledge-Augmented LLMs
Olivia Peiyu Wang, Tashvi Bansal, Ryan Bai, Emily M. Chui, Leilani H. Gilpin
TL;DR
This work addresses the challenge of logical fallacy detection in large language models by introducing a knowledge-augmented, stepwise approach. It combines an Atomic-Instruction-Dataset-for-Logical-Fallacies (AID-LF) with three-tier hierarchical classification and a Prolog-based relational graph that constrains the decision space and lets the model reconsider its initial judgment. Across multiple state-of-the-art models, the combination of stepwise instructions and relational graphs yields the strongest performance (up to ~63% accuracy) and offers greater transparency and robustness than the baseline and hierarchical methods. The study highlights the potential of neuro-symbolic strategies for mitigating reasoning deficits in LLMs, and it outlines limitations and directions for future work, including broader model generalization and richer knowledge representations.
Abstract
Large Language Models (LLMs) suffer from critical reasoning gaps, including a tendency to hallucinate and poor accuracy in classifying logical fallacies. This limitation stems from their default System 1 processing, which is fast and intuitive, whereas reliable reasoning requires the deliberate, effortful System 2 approach (Kahneman, 2011; Li et al., 2025). Since full System 2 training is often prohibitively expensive, we explore a low-cost, instruction-based intervention to bridge this gap. Our methodology introduces a novel stepwise instruction dataset that decomposes fallacy classification into a series of atomic procedural steps, each a simple binary question. We further augment this with a final verification step in which models consult a relational knowledge graph of related fallacies. This procedural, rule-based intervention significantly improves LLM accuracy on logical fallacy classification. Crucially, the approach also provides greater transparency into the LLMs' decision-making, highlighting a practical pathway for neuro-symbolic architectures to address LLM reasoning deficits.
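The two stages described in the abstract, atomic binary questions followed by a graph-based verification pass, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the questions, the labels, the related-fallacy edges, and the function names are hypothetical and are not taken from AID-LF or from the authors' Prolog graph, and the `ask` callable merely stands in for a yes/no query to an LLM.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical atomic steps: a yes/no question paired with the label it
# supports when answered "yes". These are NOT the AID-LF instructions.
STEPS: List[Tuple[str, str]] = [
    ("Does the argument attack the person instead of the claim?", "ad hominem"),
    ("Does the argument misrepresent the opposing position?", "straw man"),
    ("Does the argument present only two options as exhaustive?", "false dilemma"),
]

# Hypothetical relational graph: labels that are easily confused with each other.
RELATED: Dict[str, List[str]] = {
    "ad hominem": ["straw man"],
    "straw man": ["ad hominem", "false dilemma"],
    "false dilemma": ["straw man"],
}

# (question, text) -> True for "yes"; a stand-in for an LLM call.
Answerer = Callable[[str, str], bool]


def classify_stepwise(text: str, ask: Answerer) -> str:
    """Walk the binary questions in order and return the first supported label."""
    for question, label in STEPS:
        if ask(question, text):
            return label
    return "no fallacy identified"


def verify_with_graph(text: str, initial: str, ask: Answerer) -> str:
    """Reconsider the initial label against its neighbours in the graph."""
    for alternative in RELATED.get(initial, []):
        question = f"Is '{alternative}' a better fit than '{initial}' for this argument?"
        if ask(question, text):
            return alternative  # switch to the first neighbour judged a better fit
    return initial  # no neighbour preferred: keep the initial judgment
```

Because every decision is an explicit question with a recorded answer, the sequence of calls to `ask` doubles as a trace of how the final label was reached, which is the kind of transparency the abstract refers to.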
