Logical Reasoning in Large Language Models: A Survey
Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang
TL;DR
This survey examines the landscape of logical reasoning in large language models, categorizing reasoning into deductive, inductive, abductive, and analogical paradigms, and reviewing current tasks, benchmarks, and evaluation metrics. It maps enhancement strategies across data-centric, model-centric, external knowledge, and neuro-symbolic approaches, highlighting developments such as expert-curated and synthetic datasets, RL-based planning, and hybrid architectures. The paper identifies key gaps in robustness, generalization, and interpretability, and argues for rigorous, multi-modal evaluation frameworks and scalable, hybrid reasoning systems. Its synthesis points to practical pathways for building more reliable, verifiable AI systems capable of structured logical inference in real-world domains.
Abstract
With the emergence of advanced reasoning models like OpenAI o3 and DeepSeek-R1, large language models (LLMs) have demonstrated remarkable reasoning capabilities. However, their ability to perform rigorous logical reasoning remains an open question. This survey synthesizes recent advancements in logical reasoning within LLMs, a critical area of AI research. It outlines the scope of logical reasoning in LLMs, its theoretical foundations, and the benchmarks used to evaluate reasoning proficiency. We analyze existing capabilities across different reasoning paradigms (deductive, inductive, abductive, and analogical) and assess strategies to enhance reasoning performance, including data-centric tuning, reinforcement learning, decoding strategies, and neuro-symbolic approaches. The review concludes with future directions, emphasizing the need for further exploration to strengthen logical reasoning in AI systems.
