Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining

Shuqi Liu, Bowei He, Linqi Song

TL;DR

Bi-Chainer tackles the challenge of complex, multi-step logical reasoning in large language models by introducing bidirectional chaining that dynamically switches between forward and backward reasoning in a depth-first manner. It integrates six LLM-driven modules to identify facts, select rules, deduce or abduce, check facts, and monitor confusion, using guidance from the opposite reasoning direction to reduce branching and inference calls. Across four challenging datasets (ProofWriter, FOLIO, AR-LSAT, ParaRules), Bi-Chainer achieves sizable improvements in label accuracy and, crucially, near-perfect proof accuracy (up to 98%), while significantly reducing the number of LLM calls compared with forward-only or backward-only baselines. The approach demonstrates stronger reasoning reliability and efficiency, with additional validation on open-source models and a detailed case study illustrating how bidirectional guidance mitigates premise confusion. These findings suggest bidirectional, depth-first reasoning as a practical path to more capable and efficient automated reasoning in LLMs.

Abstract

Large Language Models (LLMs) have shown human-like reasoning abilities but still face challenges in solving complex logical problems. Existing unidirectional chaining methods, such as forward chaining and backward chaining, suffer from issues like low prediction accuracy and low efficiency. To address these issues, we propose a bidirectional chaining method, Bi-Chainer, which dynamically switches to depth-first reasoning in the opposite reasoning direction when it encounters multiple branching options within the current direction. Thus, the intermediate reasoning results can be utilized as guidance to facilitate the reasoning process. We show that Bi-Chainer achieves sizable accuracy boosts over unidirectional chaining frameworks on four challenging logical reasoning datasets. Moreover, Bi-Chainer enhances the accuracy of intermediate proof steps and reduces the average number of inference calls, resulting in more efficient and accurate reasoning.
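
The switching idea in the abstract can be illustrated with a small symbolic sketch. This is not the authors' implementation: Bi-Chainer relies on LLM-driven modules, whereas the toy below replaces them with exact set operations over propositional rules. The names `Rule`, `bi_chain`, `targets`, and the example facts and rules are all hypothetical. The sketch chains forward while the next step is unambiguous, and when several forward rules compete it takes one backward step from the goal, then uses the resulting subgoals to pick the forward rule.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical toy format: (premises, conclusion) over atomic propositions.
Rule = Tuple[FrozenSet[str], str]


def bi_chain(facts: Set[str], rules: List[Rule], goal: str) -> Tuple[bool, List[Tuple[str, str]]]:
    """Toy bidirectional chaining; returns (proved, trace of steps)."""
    known = set(facts)   # facts derived so far (forward direction)
    targets = {goal}     # goal plus subgoals found by backward steps
    trace: List[Tuple[str, str]] = []

    def deduce(rule: Rule) -> None:
        known.add(rule[1])
        trace.append(("deduce", rule[1]))

    while goal not in known:
        # Forward options: rules whose premises all hold and that add a new fact.
        options = [r for r in rules if r[0] <= known and r[1] not in known]
        if not options:
            return False, trace
        # Guidance: prefer forward rules that conclude a pending subgoal.
        guided = [r for r in options if r[1] in targets]
        if guided or len(options) == 1:
            deduce((guided or options)[0])
            continue
        # Branching without guidance: take one backward step instead.
        back = [r for r in rules
                if r[1] in targets and r[1] not in known
                and not r[0] <= (known | targets)]
        if back:
            premises, conclusion = back[0]
            targets |= premises
            trace.append(("abduce", conclusion))
        else:
            # No backward guidance is left; fall back to plain forward chaining.
            deduce(options[0])
    return True, trace


facts = {"a", "b"}
rules = [
    (frozenset({"a"}), "x"),        # distractor
    (frozenset({"b"}), "y"),        # distractor
    (frozenset({"a", "b"}), "c"),
    (frozenset({"c"}), "d"),
    (frozenset({"d"}), "goal"),
]
proved, trace = bi_chain(facts, rules, "goal")
print(proved, [atom for kind, atom in trace if kind == "deduce"])
# prints: True ['c', 'd', 'goal']
```

In this example the two backward steps mark `d` and `c` as subgoals, so the forward direction skips the distractor rules for `x` and `y`. A forward-only loop that applied the rules in list order would derive both distractors before reaching the goal.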

Paper Structure (31 sections, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: Bi-Chainer framework in bidirectional chaining (c) in comparison with the Selection-Inference framework in forward chaining (a) and the LAMBADA framework in backward chaining (b).
  • Figure 2: Label prediction accuracies on (a)-(b) ProofWriter, (c) FOLIO, (d) AR-LSAT, and (e) ParaRules datasets.
  • Figure 3: (a) Proof accuracy results on ProofWriter-PUD (Depth-5) for a set of randomly sampled examples for which the models correctly predicted the goal. (b) Precision and Recall results for Premise Selection on the selected samples from the ProofWriter-PUD (Depth-5), with shaded areas indicating the performance gap between different reasoning frameworks for the Proved, Disproved, and Unknown cases.
  • Figure 4: Comparing SI, LAMBADA with Bi-Chainer w.r.t. the average number of inference calls they make per example in different datasets.
  • Figure 5: Confusion matrices.