GraphMind: Theorem Selection and Conclusion Generation Framework with Dynamic GNN for LLM Reasoning

Yutong Li, Yitian Zhou, Xudong Wang, GuoChen, Caiyan Qin

TL;DR

GraphMind addresses the problem of representing and evolving intermediate reasoning states in LLM-based multi-step deduction by modeling the process as a dynamic heterogeneous graph. It couples a relational GNN for state encoding with a semantic theorem matcher and an LLM that generates conclusions, all in a closed loop that expands the graph at each step. The approach yields consistent improvements over strong prompting baselines across mathematics, finance, and law QA tasks, and ablations confirm the critical role of the GNN in capturing inter-premise dependencies. The framework offers a principled path toward interpretable, context-aware, and scalable reasoning for complex deductive tasks, with potential impact on formal reasoning, mathematical proof construction, and domain-specific AI assistants that require structured, verifiable reasoning traces.

Abstract

Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, including multi-step reasoning such as mathematical proving. However, existing approaches often lack an explicit and dynamic mechanism to structurally represent and evolve intermediate reasoning states, which limits their ability to perform context-aware theorem selection and iterative conclusion generation. To address these challenges, we propose GraphMind, a novel dynamic graph-based framework that integrates a graph neural network (GNN) with LLMs to iteratively select theorems and generate intermediate conclusions for multi-step reasoning. Our method models the reasoning process as a heterogeneous evolving graph, where nodes represent conditions, theorems, and conclusions, and edges capture the logical dependencies between them. By encoding the current reasoning state with a GNN and leveraging semantic matching for theorem selection, our framework enables context-aware, interpretable, and structured reasoning in a closed loop. Experiments on various question-answering (QA) datasets demonstrate that GraphMind achieves consistent performance improvements and significantly outperforms existing baselines in multi-step reasoning, validating the effectiveness and generalizability of our approach.
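
To make the graph formulation in the abstract concrete, the following is a minimal sketch of a heterogeneous reasoning graph with the three node types named above. The class name `ReasoningGraph`, the relation labels `used_by` and `derives`, and the `expand` method are illustrative assumptions, not the paper's actual data structures or edge schema.

```python
from dataclasses import dataclass, field

# Node types taken from the abstract; everything else here is hypothetical.
NODE_TYPES = {"condition", "theorem", "conclusion"}


@dataclass
class ReasoningGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> (node_type, text)
    edges: list = field(default_factory=list)  # (src_id, relation, dst_id)

    def add_node(self, node_type, text):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        node_id = len(self.nodes)
        self.nodes[node_id] = (node_type, text)
        return node_id

    def add_edge(self, src, relation, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must already be in the graph")
        self.edges.append((src, relation, dst))

    def expand(self, premise_ids, theorem_text, conclusion_text):
        """Record one reasoning step: premises + theorem -> new conclusion."""
        theorem_id = self.add_node("theorem", theorem_text)
        conclusion_id = self.add_node("conclusion", conclusion_text)
        for premise_id in premise_ids:
            # Assumed relation label: a premise is consumed by the theorem.
            self.add_edge(premise_id, "used_by", theorem_id)
        # Assumed relation label: the theorem produces the conclusion.
        self.add_edge(theorem_id, "derives", conclusion_id)
        return conclusion_id


graph = ReasoningGraph()
a = graph.add_node("condition", "x > 0")
b = graph.add_node("condition", "y = x + 1")
c = graph.expand([a, b], "adding 1 preserves strict inequality", "y > 1")
```

Because each step only appends nodes and edges, the graph doubles as a reasoning trace: every conclusion can be followed back through its theorem node to the premises that produced it.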

Paper Structure

This paper contains 22 sections, 13 equations, 1 figure, 2 tables, 1 algorithm.

Figures (1)

  • Figure 1: Overview of the proposed GraphMind framework, consisting of four core modules: graph encoding, theorem matching, conclusion generation, and graph expansion. The pipeline encodes the evolving reasoning state into a graph structure, selects context-aware theorems, generates intermediate conclusions via LLMs, and updates the graph with new nodes and edges.
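
The control flow of the four modules in Figure 1 can be sketched as a simple loop. This is a toy illustration under strong assumptions: a bag-of-words pooling function stands in for the GNN encoder, cosine similarity over word counts stands in for the semantic matcher, a caller-supplied function `generate` stands in for the LLM, and skipping already-used theorems is an added simplification. None of the function names come from the paper.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (stand-in for a learned text encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def encode_state(facts):
    """Stand-in for graph encoding: sum the embeddings of all current facts."""
    state = Counter()
    for fact in facts:
        state.update(embed(fact))
    return state


def match_theorem(state, theorems, used):
    """Return the unused theorem most similar to the state, or None."""
    candidates = [t for t in theorems if t not in used]
    if not candidates:
        return None
    return max(candidates, key=lambda t: cosine(state, embed(t)))


def reasoning_loop(conditions, theorems, generate, max_steps=3):
    """Encode -> match -> generate -> expand, repeated up to max_steps times."""
    facts = list(conditions)
    used = set()
    trace = []
    for _ in range(max_steps):
        state = encode_state(facts)                      # graph encoding
        theorem = match_theorem(state, theorems, used)   # theorem matching
        if theorem is None:
            break
        conclusion = generate(facts, theorem)            # conclusion generation
        if conclusion is None:
            break
        used.add(theorem)
        facts.append(conclusion)                         # graph expansion
        trace.append((theorem, conclusion))
    return trace


def fake_llm(facts, theorem):
    # Placeholder for an LLM call; returns a canned string.
    return f"conclusion derived via: {theorem}"


trace = reasoning_loop(
    ["triangle ABC has a right angle at C", "side a is 3 and side b is 4"],
    ["pythagorean theorem for a right triangle", "compound interest formula"],
    fake_llm,
    max_steps=2,
)
```

The point of the sketch is the feedback: each generated conclusion is appended to the state before the next theorem is selected, so later selections depend on earlier conclusions.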