Table of Contents
Fetching ...

Language Models can be Logical Solvers

Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen

TL;DR

LoGiPT directly imitates the reasoning process of logical solvers by training an open-source LM on solver-derived symbolic traces, thus bypassing NL-to-SL parsing errors that hamper traditional solver-augmented approaches. The authors construct a four-turn instruction-tuning dataset by revealing solver traces and fine-tune open-source LLMs to act as deductive solvers, outputting all implied facts given premises. Empirically, LoGiPT outperforms state-of-the-art solver-augmented LMs on ProofWriter and PrOntoQA, and in several setups matches or exceeds GPT-4 prompting, illustrating the viability of distilling symbolic solver reasoning into 13B-parameter models. The work introduces a practical, data-driven pipeline for encoding solver logic into neural models and highlights the importance of symbolic representations and carefully designed reasoning formats for robust deductive reasoning.

Abstract

Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art, solver-augmented language models, use LLMs to parse natural language logical questions into symbolic representations first and then adopt external logical solvers to take in the symbolic representations and output the answers. Despite their impressive performance, any parsing errors will inevitably result in the failure of the execution of the external logical solver and no answer to the logical questions. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses the parsing errors by learning to strict adherence to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs like ChatGPT or GPT-4.

Language Models can be Logical Solvers

TL;DR

LoGiPT directly imitates the reasoning process of logical solvers by training an open-source LM on solver-derived symbolic traces, thus bypassing NL-to-SL parsing errors that hamper traditional solver-augmented approaches. The authors construct a four-turn instruction-tuning dataset by revealing solver traces and fine-tune open-source LLMs to act as deductive solvers, outputting all implied facts given premises. Empirically, LoGiPT outperforms state-of-the-art solver-augmented LMs on ProofWriter and PrOntoQA, and in several setups matches or exceeds GPT-4 prompting, illustrating the viability of distilling symbolic solver reasoning into 13B-parameter models. The work introduces a practical, data-driven pipeline for encoding solver logic into neural models and highlights the importance of symbolic representations and carefully designed reasoning formats for robust deductive reasoning.

Abstract

Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art, solver-augmented language models, use LLMs to parse natural language logical questions into symbolic representations first and then adopt external logical solvers to take in the symbolic representations and output the answers. Despite their impressive performance, any parsing errors will inevitably result in the failure of the execution of the external logical solver and no answer to the logical questions. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses the parsing errors by learning to strict adherence to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs like ChatGPT or GPT-4.
Paper Structure (34 sections, 4 figures, 4 tables)

This paper contains 34 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Data flow of current solver-augmented LMs for inference (a), and our pipeline for LoGiPT(b,c).
  • Figure 2: A deductive reasoning question derived from ProofWriter and its parsed Facts, Rules, and Query.
  • Figure 3: A comprehensive 4-turn training example of our instruction-tuning data. We highlight the initial occurrences of each functionality described in §\ref{['sec:reveal']} using the corresponding colors. We omit some predicates and Facts in Turn-2 due to limited space. Hint: this figure is color-sensitive.
  • Figure 4: The full version of the comprehensive 4-turn training example of our instruction-tuning data shown in Figure \ref{['fig:ExampleSimplfied']}.