Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection

Abhinav Lalwani; Tasha Kim; Lovish Chopra; Christopher Hahn; Zhijing Jin; Mrinmaya Sachan

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection

Abhinav Lalwani, Tasha Kim, Lovish Chopra, Christopher Hahn, Zhijing Jin, Mrinmaya Sachan

TL;DR

The paper tackles translating natural language to formal reasoning to enable reliable verification. It introduces NL2FOL, a three-module, neurosymbolic pipeline that converts NL to $FOL$, compiles to SMT inputs, and uses a solver like $CVC4$ to assess validity, with SMT countermodels translated back to NL. Contributions include background-knowledge grounding, step-by-step intermediate natural-language outputs for interpretability, and strong cross-domain performance on Logic and LogicClimate without fine-tuning. The work demonstrates a principled bridge between NL understanding and formal reasoning, with practical impact on misinformation detection and knowledge validation.

Abstract

Translating natural language into formal language such as First-Order Logic (FOL) is a foundational challenge in NLP with wide-ranging applications in automated reasoning, misinformation tracking, and knowledge validation. In this paper, we introduce Natural Language to First-Order Logic (NL2FOL), a framework to autoformalize natural language to FOL step by step using Large Language Models (LLMs). Our approach addresses key challenges in this translation process, including the integration of implicit background knowledge. By leveraging structured representations generated by NL2FOL, we use Satisfiability Modulo Theory (SMT) solvers to reason about the logical validity of natural language statements. We present logical fallacy detection as a case study to evaluate the efficacy of NL2FOL. Being neurosymbolic, our approach also provides interpretable insights into the reasoning process and demonstrates robustness without requiring model fine-tuning or labeled training data. Our framework achieves strong performance on multiple datasets. On the LOGIC dataset, NL2FOL achieves an F1-score of 78%, while generalizing effectively to the LOGICCLIMATE dataset with an F1-score of 80%.

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection

TL;DR

The paper tackles translating natural language to formal reasoning to enable reliable verification. It introduces NL2FOL, a three-module, neurosymbolic pipeline that converts NL to

, compiles to SMT inputs, and uses a solver like

to assess validity, with SMT countermodels translated back to NL. Contributions include background-knowledge grounding, step-by-step intermediate natural-language outputs for interpretability, and strong cross-domain performance on Logic and LogicClimate without fine-tuning. The work demonstrates a principled bridge between NL understanding and formal reasoning, with practical impact on misinformation detection and knowledge validation.

Abstract

Paper Structure (14 sections, 3 equations, 2 figures, 6 tables, 2 algorithms)

This paper contains 14 sections, 3 equations, 2 figures, 6 tables, 2 algorithms.

Introduction
Related Work
Methodology
Experiments
Results and Discussion
Quantitative Analysis
Qualitative Analysis
Success Modes of NL2FOL
Failure Modes of NL2FOL
Conclusion
Algorithms
Prompt Examples
End-to-end LLM Prompts
Intermediate NL2FOL Prompts

Figures (2)

Figure 1: Overview of the proposed framework used for logical fallacy detection. Module A converts natural language input to a first-order logic formula merged with contextual relationships, Module B compiles the negation of a given logical formula to an SMT file with well-defined sorts for variables and predicates, and Module C runs CVC on the SMT file and if the negation is satisfiable, interprets the counter-model in natural language.
Figure 2: Example of logical fallacy detection using NL2FOL. The resulting classification is explained using a counterexample generated by the SMT solver.

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection

TL;DR

Abstract

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection

Authors

TL;DR

Abstract

Table of Contents

Figures (2)