LOGIC-LM++: Multi-Step Refinement for Symbolic Formulations
Shashank Kirtania, Priyanshu Gupta, Arjun Radhakrishna
TL;DR
The paper addresses the semantic weaknesses of LLM-based symbolic reasoning by introducing Logic-LM++, which augments the Logic-LM framework with pairwise comparison-based semantic checks and richer refinement context. It adds a Self-Refinement Agent to focus refinements on the problem statement and a Backtracking Agent to prune non-improving edits, aiming for semantically correct symbolic formulations. Evaluations on FOLIO, AR-LSAT, and ProofWriter show substantial improvements over baselines, including notable gains in execution accuracy and consistency across prompting regimes. The work demonstrates the potential to generalize semantic refinement to tool-augmented reasoning, while acknowledging limitations tied to the quality of the initial formulation and to smaller LLMs, which capture semantics less reliably.
Abstract
In this paper we examine the limitations of Large Language Models (LLMs) for complex reasoning tasks. Although recent works have started to employ formal languages as an intermediate representation for reasoning tasks, they often face challenges in accurately generating and refining these formal specifications to ensure correctness. To address these issues, this paper proposes Logic-LM++, an improvement on Logic-LM. It uses the ability of LLMs to do pairwise comparisons, allowing the evaluation of the refinements suggested by the LLM. The paper demonstrates that Logic-LM++ outperforms Logic-LM and other contemporary techniques across natural language reasoning tasks on three datasets, FOLIO, ProofWriter and AR-LSAT, with an average improvement of 18.5% on standard prompting, 12.3% on chain of thought prompting and 5% on Logic-LM.
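The refinement idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `refine_with_backtracking` and the callables `refine` and `prefers_candidate` are hypothetical stand-ins for the LLM calls that would propose a revised symbolic formulation and judge, by pairwise comparison, whether the revision is better than the current one. The sketch assumes a candidate is kept only when the comparison favors it, and that the loop otherwise backtracks to the previous formulation.

```python
from typing import Callable


def refine_with_backtracking(
    problem: str,
    initial: str,
    refine: Callable[[str, str], str],
    prefers_candidate: Callable[[str, str, str], bool],
    max_rounds: int = 3,
) -> str:
    """Iteratively refine a symbolic formulation, keeping only improvements.

    refine(problem, current) -> candidate formulation (hypothetical LLM call).
    prefers_candidate(problem, current, candidate) -> True if the pairwise
    comparison judges the candidate better (hypothetical LLM call).
    """
    current = initial
    for _ in range(max_rounds):
        candidate = refine(problem, current)
        if prefers_candidate(problem, current, candidate):
            # The comparison favors the refinement, so accept it.
            current = candidate
        # Otherwise backtrack: discard the candidate and keep `current`.
    return current
```

In practice both callables would wrap prompts to a language model; passing them in as plain functions keeps the control flow testable with deterministic stubs.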
