ProofSketch: Efficient Verified Reasoning for Large Language Models
Disha Sheshanarayana, Tanishka Magar
TL;DR
ProofSketch tackles the inefficiency of long reasoning chains in large language models by introducing a verification-guided framework that generates multiple short, atomic sketches and selects the one with maximal verification coverage. It combines symbolic closure computation with verifier-gated generation and lexicographic verification to balance accuracy, efficiency, and formal guarantees. The approach is formalized as a multi-objective optimization with token budgets and certification requirements, and is evaluated on the ProofWriter dataset across multiple models, demonstrating substantial token savings and meaningful certification rates while maintaining competitive accuracy. The work advances trustworthy neural reasoning in compute-constrained settings and highlights avenues to further reduce latency and extend to more complex domains.
Abstract
Reasoning methods such as chain-of-thought prompting and self-consistency have shown immense potential to improve the accuracy of large language models across various reasoning tasks. However such methods involve generation of lengthy reasoning chains, which substantially increases token consumption, computational cost, and latency. To address this inefficiency, we propose ProofSketch, a verification-guided reasoning framework that integrates symbolic closure computation, lexicographic verification and adaptive sketch generation. Our experiments show that ProofSketch consistently reduces token usage while improving accuracy, demonstrating that this approach offers a promising path for efficient and trustworthy reasoning.
