Follow My Lead: Logical Fallacy Classification with Knowledge-Augmented LLMs

Olivia Peiyu Wang, Tashvi Bansal, Ryan Bai, Emily M. Chui, Leilani H. Gilpin

TL;DR

This work addresses the challenge of logical fallacy detection in large language models by introducing a knowledge-augmented, stepwise approach. It combines an Atomic-Instruction-Dataset-for-Logical-Fallacies (AID-LF) with three-tier hierarchical classification and a Prolog-based relational graph to constrain the decision space and let models reconsider their initial judgments. Across multiple state-of-the-art models, combining stepwise instructions with relational graphs yields the strongest performance (up to ~63% accuracy), with improved transparency and robustness over the baseline and hierarchical methods. The study highlights the potential of neuro-symbolic strategies for mitigating reasoning deficits in LLMs, and it outlines limitations and directions for future work, including broader model generalization and richer knowledge representations.

Abstract

Large Language Models (LLMs) suffer from critical reasoning gaps, including a tendency to hallucinate and poor accuracy in classifying logical fallacies. This limitation stems from their default System 1 processing, which is fast and intuitive, whereas reliable reasoning requires the deliberate, effortful System 2 approach (Kahneman, 2011; Li et al., 2025). Since full System 2 training is often prohibitively expensive, we explore a low-cost, instruction-based intervention to bridge this gap. Our methodology introduces a novel stepwise instruction dataset that decomposes fallacy classification into a series of atomic procedural steps (simple binary questions). We further augment this with a final verification step in which models consult a relational knowledge graph of related fallacies. This procedural, rule-based intervention yields a significant improvement in LLM logical fallacy classification. Crucially, the approach also provides enhanced transparency into the LLMs' decision-making, highlighting a practical pathway for neuro-symbolic architectures to address LLM reasoning deficits.
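The stepwise procedure and the final verification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper uses an LLM to answer each binary question and a Prolog-based graph, whereas here toy keyword checks stand in for the model and a plain dictionary stands in for the graph. The names `contains`, `INSTRUCTIONS`, `RELATED`, `classify_stepwise`, and `reconsider` are hypothetical, as are the two example entries and the tie-breaking rule (keep the initial label unless a related fallacy satisfies a larger share of its steps).

```python
from typing import Callable, Dict, List, Optional

# A step is one atomic yes/no question about a statement.
# ASSUMPTION: in the paper an LLM answers each question; here simple
# keyword checks stand in so that the sketch runs without a model.
Step = Callable[[str], bool]


def contains(*phrases: str) -> Step:
    """Build a toy binary question: does the statement mention any phrase?"""
    def check(statement: str) -> bool:
        text = statement.lower()
        return any(p in text for p in phrases)
    return check


# Hypothetical instruction entries (illustrative, not taken from AID-LF).
INSTRUCTIONS: Dict[str, List[Step]] = {
    "ad_hominem": [contains("liar", "idiot"), contains("wrong", "false")],
    "tu_quoque": [contains("you do it too", "you also"), contains("wrong", "false")],
}

# Hypothetical relational graph: fallacy -> related fallacies.
RELATED: Dict[str, List[str]] = {
    "ad_hominem": ["tu_quoque"],
    "tu_quoque": ["ad_hominem"],
}


def classify_stepwise(statement: str) -> Optional[str]:
    """Return the first fallacy whose steps are all answered 'yes'."""
    for name, steps in INSTRUCTIONS.items():
        if all(step(statement) for step in steps):
            return name
    return None


def step_score(statement: str, name: str) -> float:
    """Fraction of a fallacy's steps that the statement satisfies."""
    steps = INSTRUCTIONS[name]
    return sum(step(statement) for step in steps) / len(steps)


def reconsider(statement: str, initial: str) -> str:
    """Verification: compare the initial label against its graph neighbours.

    ASSUMPTION: ties keep the initial label (max returns the first maximum).
    """
    candidates = [initial] + [r for r in RELATED.get(initial, []) if r in INSTRUCTIONS]
    return max(candidates, key=lambda name: step_score(statement, name))
```

For example, `reconsider("You do it too, so you are wrong to criticize me.", "ad_hominem")` moves an initial `ad_hominem` guess to the related `tu_quoque`, because the neighbour satisfies both of its steps while the initial label satisfies only one.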

Paper Structure

This paper contains 28 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Visualization of Stepwise Instructions Dataset Pipeline. This figure illustrates the two-phase pipeline used to create the Atomic-Instruction-Dataset-for-Logical-Fallacies (AID-LF). Phase 1 involved using large language models (LLMs) to decompose the descriptions and logical forms of 232 fallacies into atomic, actionable steps. Phase 2 consisted of a rigorous human annotation process, where authors performed three rounds of verification and cross-checking. The annotations ensured that the instructions accurately reflected the original fallacy definitions, minimized redundancy, and provided complete coverage.
  • Figure 2: Baseline Establishment Process. For the baseline, large language models (LLMs) were provided with the descriptions and logical forms of all 232 fallacies, along with a statement for classification. The LLMs were instructed to select the most appropriate fallacy.
  • Figure 3: Three-tiered Hierarchical Classification of Logical Fallacy. The hierarchical classification process proceeded in three distinct levels. Level 1: The LLMs were initially prompted to classify a statement as either a Formal or Informal Fallacy, using only the definitions of these two categories. Level 2: Based on the first-level classification, the models were then supplied with definitions for the corresponding subcategories (e.g., "Proposition," "Quantification," "Syllogism" for formal fallacies, or "Ambiguity," "Inconsistency," "Irrelevance," "Insufficiency," "Inappropriate Presumption" for informal fallacies). The LLMs were given the option to revise their initial classification. Level 3: For the final, most granular classification, the models were prompted to select the specific fallacy from a detailed list that included each fallacy's description and logical form. At this stage, they could also revise any prior decisions.
  • Figure 4: Stepwise Instruction Classification Process. The LLMs were supplied with the stepwise instruction dataset and the statement to be classified. They were instructed to execute the steps for each fallacy and return the first fallacy for which all steps matched the ground truth.
  • Figure 5: Instruction-Guided Classification with Relational Graphs
  • ...and 2 more figures
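
The three-tier narrowing described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The top-level categories and subcategory names come from the caption; the leaf fallacies and their placement under each subcategory are illustrative guesses, and `TAXONOMY`, `Chooser`, `classify_hierarchically`, and `scripted_chooser` are hypothetical names. The `choose` callback stands in for the LLM call, and the option to revise earlier decisions at Levels 2 and 3 is not modelled.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Levels 1 and 2 use the category names from the Figure 3 caption.
# ASSUMPTION: the leaf fallacies below are illustrative placements only.
TAXONOMY: Dict[str, Dict[str, List[str]]] = {
    "Formal": {
        "Proposition": ["Affirming the Consequent"],
        "Quantification": ["Existential Fallacy"],
        "Syllogism": ["Undistributed Middle"],
    },
    "Informal": {
        "Ambiguity": ["Equivocation"],
        "Inconsistency": ["Conflicting Conditions"],
        "Irrelevance": ["Ad Hominem"],
        "Insufficiency": ["Hasty Generalization"],
        "Inappropriate Presumption": ["Complex Question"],
    },
}

# Stand-in for the model: given a statement and the options visible at the
# current level, return one of those options.
Chooser = Callable[[str, List[str]], str]


def classify_hierarchically(statement: str, choose: Chooser) -> Tuple[str, str, str]:
    """Narrow the decision space one level at a time."""
    top = choose(statement, list(TAXONOMY))             # Level 1: Formal / Informal
    sub = choose(statement, list(TAXONOMY[top]))        # Level 2: subcategory
    leaf = choose(statement, list(TAXONOMY[top][sub]))  # Level 3: specific fallacy
    return top, sub, leaf


def scripted_chooser(answers: Iterable[str]) -> Chooser:
    """Replay fixed answers so the sketch can be run without an LLM."""
    remaining = iter(answers)

    def choose(statement: str, options: List[str]) -> str:
        pick = next(remaining)
        if pick not in options:
            raise ValueError(f"{pick!r} is not among the options {options}")
        return pick

    return choose
```

Calling `classify_hierarchically(statement, scripted_chooser(["Informal", "Irrelevance", "Ad Hominem"]))` walks the three levels in order; at each level the chooser only sees the options under the branch selected so far, which is the sense in which the hierarchy constrains the decision space.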