Language Models can be Logical Solvers
Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen
TL;DR
LoGiPT directly imitates the reasoning process of logical solvers by training an open-source LM on solver-derived symbolic traces, thus bypassing NL-to-SL parsing errors that hamper traditional solver-augmented approaches. The authors construct a four-turn instruction-tuning dataset by revealing solver traces and fine-tune open-source LLMs to act as deductive solvers, outputting all implied facts given premises. Empirically, LoGiPT outperforms state-of-the-art solver-augmented LMs on ProofWriter and PrOntoQA, and in several setups matches or exceeds GPT-4 prompting, illustrating the viability of distilling symbolic solver reasoning into 13B-parameter models. The work introduces a practical, data-driven pipeline for encoding solver logic into neural models and highlights the importance of symbolic representations and carefully designed reasoning formats for robust deductive reasoning.
Abstract
Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art, solver-augmented language models, use LLMs to parse natural language logical questions into symbolic representations first and then adopt external logical solvers to take in the symbolic representations and output the answers. Despite their impressive performance, any parsing errors will inevitably result in the failure of the execution of the external logical solver and no answer to the logical questions. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses the parsing errors by learning to strict adherence to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs like ChatGPT or GPT-4.
