Logical Reasoning in Large Language Models: A Survey

Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang

TL;DR

This survey examines the landscape of logical reasoning in large language models, categorizing reasoning into deductive, inductive, abductive, and analogical paradigms, and reviewing current tasks, benchmarks, and evaluation metrics. It maps enhancement strategies across data-centric, model-centric, external knowledge, and neuro-symbolic approaches, highlighting developments like expert-curated and synthetic datasets, RL-based planning, and hybrid architectures. The paper identifies key gaps in robustness, generalization, and interpretability, and argues for rigorous, multi-modal evaluation frameworks and scalable, hybrid reasoning systems. Its synthesis points to practical pathways for building more reliable, verifiable AI systems capable of structured logical inference in real-world domains.

Abstract

With the emergence of advanced reasoning models like OpenAI o3 and DeepSeek-R1, large language models (LLMs) have demonstrated remarkable reasoning capabilities. However, their ability to perform rigorous logical reasoning remains an open question. This survey synthesizes recent advancements in logical reasoning within LLMs, a critical area of AI research. It outlines the scope of logical reasoning in LLMs, its theoretical foundations, and the benchmarks used to evaluate reasoning proficiency. We analyze existing capabilities across different reasoning paradigms - deductive, inductive, abductive, and analogical - and assess strategies to enhance reasoning performance, including data-centric tuning, reinforcement learning, decoding strategies, and neuro-symbolic approaches. The review concludes with future directions, emphasizing the need for further exploration to strengthen logical reasoning in AI systems.

Paper Structure

This paper contains 36 sections, 4 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: The structure of this survey
  • Figure 2: Example tests of logical reasoning in NLP tasks.