Graph of Verification: Structured Verification of LLM Reasoning with Directed Acyclic Graphs
Jiwei Fang, Bin Zhang, Changwei Wang, Jin Wan, Zhiwei Xu
TL;DR
This work presents Graph of Verification (GoV), a training-free framework that verifies LLM reasoning by modeling it as a directed acyclic graph and using a flexible node-block architecture to adapt verification granularity to the task. It formalizes a two-dimensional design space (Verification Granularity and Contextual Scope) to unify verification across well-structured and loosely structured reasoning. GoV verifies atomic nodes or node blocks in topological order, supplying each unit with its premises as context, to localize errors and improve robustness. Empirical results on Number Triangle Summation and ProcessBench show that GoV outperforms holistic verification and prior decomposition-based methods, demonstrating its applicability to both formal and natural-language reasoning. The framework advances transparent, scalable verification and opens avenues toward interactive correction workflows.
Abstract
Verifying the complex, multi-step reasoning of Large Language Models (LLMs) is a critical challenge, as holistic methods often overlook localized flaws. Step-by-step validation is a promising alternative, yet existing methods are often rigid: they struggle to adapt to diverse reasoning structures, from formal proofs to informal natural language narratives. To address this adaptability gap, we propose the Graph of Verification (GoV), a novel framework for adaptable and multi-granular verification. GoV's core innovation is its flexible "node block" architecture. This mechanism allows GoV to adjust its verification granularity, from atomic steps for formal tasks to entire paragraphs for natural language, to match the native structure of the reasoning process. This flexibility allows GoV to resolve the fundamental trade-off between verification precision and robustness. Experiments on both well-structured and loosely structured benchmarks demonstrate GoV's versatility. The results show that GoV's adaptive approach significantly outperforms both holistic baselines and other state-of-the-art decomposition-based methods, establishing a new standard for training-free reasoning verification.
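
The topologically ordered verification described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `verify_dag` and `sum_check`, the toy data, and the choice to stop at the first failing unit are all assumptions made for illustration. Each unit may stand for an atomic node or a node block, and `sum_check` is a hypothetical stand-in for what would be an LLM verifier call in practice.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Optional, Set


def verify_dag(
    premises: Dict[str, Set[str]],
    content: Dict[str, Any],
    check: Callable[[Any, List[Any]], bool],
) -> Optional[str]:
    """Verify units in topological order; return the first failing unit, or None.

    premises maps each unit (an atomic node or a node block) to the units
    it depends on. content maps each unit to its claim. check is a
    hypothetical verifier: it receives a claim plus the claims of its
    premises and returns True if the claim follows from them.
    """
    # static_order() yields every unit after all of its premises.
    for unit in TopologicalSorter(premises).static_order():
        context = [content[p] for p in sorted(premises.get(unit, ()))]
        if not check(content[unit], context):
            return unit  # assumption: stop and report the earliest error
    return None


def sum_check(claim: int, context: List[int]) -> bool:
    """Toy verifier: givens (no premises) pass; others must equal the premise sum."""
    return not context or claim == sum(context)


# Toy reasoning chain: s1 = a + b, s2 = s1 + c.
premises = {"s1": {"a", "b"}, "s2": {"s1", "c"}}
correct = {"a": 1, "b": 2, "c": 3, "s1": 3, "s2": 6}
flawed = {"a": 1, "b": 2, "c": 3, "s1": 3, "s2": 7}

first_error_correct = verify_dag(premises, correct, sum_check)
first_error_flawed = verify_dag(premises, flawed, sum_check)
```

Because each unit is checked only against its own premises, a failure points to a specific unit rather than to the reasoning chain as a whole. Coarser granularity can be modeled in this sketch by letting a single key in `premises` represent a whole paragraph instead of one step.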
