Adaptive Selection of Symbolic Languages for Improving LLM Logical Reasoning
Xiangyu Wang, Haocheng Yang, Fengxiang Cheng, Fenrong Liu
TL;DR
The paper tackles the problem that solver-based LLM reasoning is highly sensitive to how NL problems are translated into a symbolic language. It proposes an adaptive approach that selects the most suitable symbolic language among FOL, LP, and SAT for each problem, then translates NL to that SL and solves using an external solver. Experiments on ProntoQA, ProofWriter, and LogicalDeduction show that adaptive SL selection substantially outperforms fixed-SL and random baselines, achieving up to 96% Overall-Acc on a mixed dataset. The work demonstrates that dynamic SL choice can significantly enhance NL-to-SL translation accuracy and final reasoning accuracy, and suggests extending the set of SLs and developing a formal theory to guide SL selection.
Abstract
Large Language Models (LLMs) still struggle with complex logical reasoning. While previous works achieve remarkable improvements, their performance is highly dependent on the correctness of translating natural language (NL) problems into a symbolic language (SL). Although numerous works focus on improving this translation accuracy, they only consider the similarity between the meaning of the SL and the NL, overlooking another crucial influencing factor: the selection of the target SL type itself. For example, first-order logic specializes in logical reasoning with categorical syllogisms and complex quantifiers, while the Boolean satisfiability formalism excels at representing constraint satisfaction like partial problems. To our knowledge, this is the first paper to claim and verify that different NL logical reasoning problems correspond to different optimal SL formalizations for translation. Based on this, we propose a method to improve the logical reasoning performance of LLMs by adaptively selecting the most suitable SL for each problem prior to translation. Specifically, we leverage LLMs to select the target SL among first-order logic, logic programming, and Boolean satisfiability, then translate the NL problem into expressions in the target SL and employ the corresponding logical solver to derive the final answer. Experimental results on benchmarks show that our adaptive selection method significantly outperforms both translating all problems into a single SL and randomly selecting the SL. On a mixed dataset of these benchmarks, our approach achieves 96% accuracy, improving performance by 25% compared to the second-highest accuracy, which is obtained by first-order logic translation.
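The select-translate-solve pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Backend`, `solve_adaptively`, `toy_select`, and `make_stub_backend` are hypothetical, the keyword rule merely stands in for the LLM-based selector, and the stub backends stand in for real NL-to-SL translators and external solvers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# The three candidate symbolic languages named in the abstract.
SYMBOLIC_LANGUAGES = ("FOL", "LP", "SAT")


@dataclass
class Backend:
    """Hypothetical pairing of a translator and a solver for one SL."""
    translate: Callable[[str], str]  # NL problem -> SL program text
    solve: Callable[[str], str]      # SL program text -> answer


def solve_adaptively(
    problem: str,
    select: Callable[[str], str],
    backends: Dict[str, Backend],
) -> Tuple[str, str]:
    """Pick an SL for this problem, translate into it, then run its solver."""
    sl = select(problem)
    if sl not in backends:
        raise ValueError(f"selector returned unsupported SL: {sl!r}")
    backend = backends[sl]
    program = backend.translate(problem)
    return sl, backend.solve(program)


def toy_select(problem: str) -> str:
    """Toy keyword rule standing in for the LLM selector (an assumption)."""
    text = problem.lower()
    if "left of" in text or "right of" in text:
        return "SAT"  # ordering constraints
    if "all " in text or "every " in text:
        return "FOL"  # quantified statements
    return "LP"


def make_stub_backend(name: str) -> Backend:
    """Stub translator and solver; real ones would call an LLM and a solver."""
    return Backend(
        translate=lambda p: f"{name}::{p}",
        solve=lambda prog: f"answer-from-{prog.split('::')[0]}-solver",
    )


toy_backends = {name: make_stub_backend(name) for name in SYMBOLIC_LANGUAGES}
```

In this sketch the selector and the backends are passed in as plain callables, so the fixed-SL and random baselines mentioned in the abstract could be expressed by swapping in a selector that always returns one language or one that picks at random.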
