Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Thought

Xiaoyu Tan, Yongxin Deng, Xihe Qiu, Weidi Xu, Chao Qu, Wei Chu, Yinghui Xu, Yuan Qi

TL;DR

Thought-Like-Pro presents a self-driven framework that leverages a Prolog-based symbolic engine to verify reasoning trajectories and generate CoT-like demonstrations for imitation learning in LLMs. By deriving and translating verified Prolog trajectories into natural language CoT, the approach trains models to imitate strictly logical reasoning and employs model averaging to mitigate catastrophic forgetting, achieving strong performance on logic-intensive benchmarks and improved OOD generalization. The method is practical, reproducible with open-source LLMs, and demonstrates a promising integration of symbolic and neural reasoning for enhanced general reasoning capabilities. These findings suggest a viable path toward more reliable, generalizable reasoning in AI systems with potential industrial impact.
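The model-averaging step mentioned above can be illustrated with a minimal sketch. It assumes the simplest form, a linear interpolation between the parameters of the original model and the fine-tuned model; the weight `alpha`, the function name `average_models`, and the dict-of-lists parameter format are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List

# Assumption: a "model" is a mapping from parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def average_models(base: Params, tuned: Params, alpha: float = 0.5) -> Params:
    """Return alpha * tuned + (1 - alpha) * base for every parameter."""
    if base.keys() != tuned.keys():
        raise ValueError("models must share the same parameter names")
    merged: Params = {}
    for name, base_values in base.items():
        tuned_values = tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            alpha * t + (1.0 - alpha) * b
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy parameters (hypothetical values, for illustration only).
base = {"w": [0.0, 2.0], "b": [1.0]}
tuned = {"w": [1.0, 4.0], "b": [3.0]}
merged = average_models(base, tuned, alpha=0.5)
```

With `alpha` close to 1 the merged model stays near the fine-tuned weights; with `alpha` close to 0 it stays near the original model, which is the intuition behind using averaging to limit forgetting.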

Abstract

Large language models (LLMs) have shown exceptional performance as general-purpose assistants, excelling across a variety of reasoning tasks. This achievement represents a significant step toward artificial general intelligence (AGI). Despite these advancements, the effectiveness of LLMs often hinges on the specific prompting strategies employed, and a robust framework for learning and generalizing across diverse reasoning tasks is still lacking. To address these challenges, we introduce a novel learning framework, THOUGHT-LIKE-PRO. In this framework, we use imitation learning to imitate the Chain-of-Thought (CoT) process, which is verified and translated from reasoning trajectories generated by a symbolic Prolog logic engine. The framework proceeds in a self-driven manner: it enables LLMs to formulate rules and statements from given instructions and to leverage the symbolic Prolog engine to derive results. Subsequently, LLMs convert the Prolog-derived successive reasoning trajectories into natural language CoT for imitation learning. Our empirical findings indicate that the proposed approach substantially enhances the reasoning abilities of LLMs and demonstrates robust generalization across out-of-distribution reasoning tasks.
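The pipeline described in the abstract (rules and facts, symbolic derivation, trajectory, natural language CoT) can be sketched in a few lines. This is only a toy stand-in under stated assumptions: a small backward chainer over ground Horn clauses replaces a real Prolog engine, the facts and rules are hard-coded where the framework would have an LLM produce them, and a string template replaces the LLM translation step. The names `FACTS`, `RULES`, `prove`, and `to_cot` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical knowledge base for the toy question:
# "Socrates is a man. All men are mortal. Is Socrates mortal?"
FACTS = {"man(socrates)"}
RULES: Dict[str, List[List[str]]] = {
    # head: list of alternative bodies (each body is a list of subgoals)
    "mortal(socrates)": [["man(socrates)"]],
}

Step = Tuple[str, List[str]]  # (derived goal, premises used)


def prove(goal: str, trace: List[Step], depth: int = 0) -> bool:
    """Depth-first backward chaining; appends each successful step to trace."""
    if depth > 50:  # crude guard against cyclic rules
        return False
    if goal in FACTS:
        trace.append((goal, []))
        return True
    for body in RULES.get(goal, []):
        mark = len(trace)
        if all(prove(sub, trace, depth + 1) for sub in body):
            trace.append((goal, body))
            return True
        del trace[mark:]  # backtrack: discard steps from the failed branch
    return False


def to_cot(trace: List[Step]) -> str:
    """Render a verified trajectory as CoT-like text (template, not an LLM)."""
    lines = []
    for i, (goal, body) in enumerate(trace, start=1):
        if body:
            lines.append(f"Step {i}: Since {' and '.join(body)}, we conclude {goal}.")
        else:
            lines.append(f"Step {i}: We are given {goal}.")
    return "\n".join(lines)


trace: List[Step] = []
proved = prove("mortal(socrates)", trace)
cot = to_cot(trace) if proved else ""
```

Only trajectories for which `prove` succeeds would be rendered and kept as demonstrations, which mirrors the role the symbolic engine plays as a verifier in the framework.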

Paper Structure (13 sections, 6 equations, 1 figure, 1 table)