Asking LLMs to Verify First is Almost Free Lunch
Shiguang Wu, Quanming Yao
TL;DR
The paper tackles the high costs of improving LLM reasoning by introducing Verification-First prompting, which asks models to verify a candidate answer before solving, leveraging a reverse reasoning path that complements forward chain-of-thought. It generalizes this idea into Iter-VF for test-time scaling, enabling sequential verification that maintains a compact context window. Across diverse benchmarks and model families, VF with random/trivial answers consistently outperforms standard CoT, and Iter-VF outperforms existing test-time strategies with minimal overhead. The approach proves robust in real-world, open-ended tasks and under thought-hidden LLM services, offering a training-free, cost-effective path to stronger reasoning.
Abstract
To enhance the reasoning capabilities of Large Language Models (LLMs) without the high cost of training or extensive test-time sampling, we introduce Verification-First (VF), a strategy that prompts models to verify a provided candidate answer, even a trivial or random one, before generating a solution. This approach triggers a "reverse reasoning" process that is cognitively easier than, and complementary to, standard forward Chain-of-Thought (CoT), effectively invoking the model's critical thinking to reduce logical errors. We further generalize the VF strategy to Iter-VF, a sequential test-time scaling (TTS) method that iteratively repeats the verification-generation process using the model's previous answer as the next candidate. Extensive experiments across various benchmarks (from mathematical reasoning to coding and agentic tasks) and various LLMs (from open-source 1B models to cutting-edge commercial ones) confirm that VF with a random answer consistently outperforms standard CoT with minimal computational overhead, and that Iter-VF outperforms existing TTS strategies.
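To make the verify-then-solve loop concrete, the following is a minimal Python sketch of how VF prompting and Iter-VF could be wired together. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the `Answer:` output convention, and the names `build_vf_prompt`, `extract_answer`, `iter_vf`, and the `llm` callable are all hypothetical. The only properties taken from the abstract are that the model is given a candidate answer (possibly trivial or random) to verify before solving, and that each iteration passes forward only the previous answer.

```python
from typing import Callable


def build_vf_prompt(question: str, candidate: str) -> str:
    """Build a Verification-First style prompt (wording is an assumption)."""
    return (
        f"Question: {question}\n"
        f"A candidate answer is: {candidate}\n"
        "First verify whether this candidate answer is correct. "
        "Then solve the problem step by step and give your final answer "
        "on the last line in the form 'Answer: <value>'."
    )


def extract_answer(response: str) -> str:
    """Return the text after the last 'Answer:' line (hypothetical convention).

    Falls back to the whole stripped response if no such line exists.
    """
    for line in reversed(response.strip().splitlines()):
        if line.strip().lower().startswith("answer:"):
            return line.split(":", 1)[1].strip()
    return response.strip()


def iter_vf(
    question: str,
    llm: Callable[[str], str],
    initial_candidate: str = "1",
    rounds: int = 3,
) -> str:
    """Iterate verify-then-solve, feeding each answer back as the next candidate.

    Only the previous answer is carried forward, not the previous reasoning,
    so the prompt stays roughly the same size in every round.
    """
    candidate = initial_candidate  # may be trivial or random
    for _ in range(rounds):
        response = llm(build_vf_prompt(question, candidate))
        candidate = extract_answer(response)
    return candidate
```

With `rounds=1` this reduces to a single VF call. Any function that maps a prompt string to a completion string can be supplied as `llm`, for example a thin wrapper around whichever model client is in use.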
