RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation
Qingyao Li, Wei Xia, Kounianhua Du, Xinyi Dai, Ruiming Tang, Yasheng Wang, Yong Yu, Weinan Zhang
TL;DR
RethinkMCTS introduces a thought-search framework that merges Monte Carlo Tree Search with a rethink refinement mechanism to explicitly explore and correct the reasoning processes behind code generation. By incorporating block-level execution feedback and a dual evaluation scheme, the approach guides search toward higher-quality reasoning traces and code, outperforming prior reflection-based and tree-search baselines on APPS and HumanEval. Key contributions include the integration of a formal thought-space search, fine-grained verbal feedback, and a refinement step that regenerates erroneous thoughts without degrading accumulated rewards. The work demonstrates that refining the reasoning pathways, rather than merely logging errors, yields substantial improvements in code generation quality and offers a generalizable paradigm for reasoning-intensive LLM tasks.
Abstract
Tree search methods have demonstrated impressive performance in code generation. Previous methods combine tree search with reflection that summarizes past mistakes to achieve iterative improvement. However, these methods face significant challenges. First, they search directly within the code language space, neglecting the underlying reasoning process critical for effective code generation. Second, reflection-based approaches merely accumulate historical errors in memory without providing correct reasoning pathways, which makes it difficult for subsequent search iterations to identify optimal solutions and degrades search quality. In this work, we propose RethinkMCTS, a framework that systematically explores and refines the reasoning process for code generation. Specifically, we employ MCTS to search for thoughts before code generation and integrate MCTS with a refinement mechanism called rethink, which incorporates fine-grained code execution feedback to refine erroneous thoughts during the search. This refinement keeps the search path aligned with better reasoning, improving overall search quality. Through extensive experiments, we demonstrate that RethinkMCTS outperforms previous search-based and feedback-enhanced code generation baselines.
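
The loop described in the abstract (search over thoughts with MCTS, evaluate, and rewrite an erroneous thought in place rather than only logging the error) can be illustrated with a small toy sketch. This is not the authors' implementation: the names `Node`, `search`, `propose`, `evaluate`, and `rethink` are hypothetical, the three stubs stand in for LLM calls and code execution, and the choice to rethink only the most recently evaluated thought is an assumption made for brevity.

```python
import math

class Node:
    """One thought in the search tree (hypothetical structure)."""
    def __init__(self, thought, parent=None):
        self.thought, self.parent = thought, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def path(self):
        """Thoughts from the first step down to this node (root excluded)."""
        node, out = self, []
        while node.parent is not None:
            out.append(node.thought)
            node = node.parent
        return out[::-1]

def uct(child, c=1.4):
    """Standard UCT score; unvisited children are tried first."""
    if child.visits == 0:
        return float("inf")
    return child.value + c * math.sqrt(math.log(child.parent.visits) / child.visits)

def search(propose, evaluate, rethink, iterations=20, max_depth=3):
    root, best = Node("<root>"), (float("-inf"), [])
    for _ in range(iterations):
        node = root
        while node.children:                      # selection
            node = max(node.children, key=uct)
        can_expand = node.parent is None or node.visits > 0
        if can_expand and len(node.path()) < max_depth:   # expansion
            node.children = [Node(t, node) for t in propose(node.path())]
            if node.children:
                node = node.children[0]
        reward, feedback = evaluate(node.path())  # evaluation
        if feedback is not None and node.parent is not None:
            # Rethink (assumed form): rewrite the thought in place using the
            # feedback, then re-evaluate the refined path.
            node.thought = rethink(node.path(), feedback)
            reward, feedback = evaluate(node.path())
        if reward > best[0]:
            best = (reward, node.path())
        while node is not None:                   # backpropagation
            node.visits += 1
            node.value += (reward - node.value) / node.visits
            node = node.parent
    return best

# Toy stand-ins (assumptions): a real system would call an LLM and run tests.
TARGET = ["read input", "sort values", "scan pairs"]

def propose(path):
    """Always proposes flawed thoughts, so progress depends on rethink."""
    step = len(path)
    return [f"wrong step {step}", f"other step {step}"]

def evaluate(path):
    """Reward is the fraction of correct steps; feedback names the first bad one."""
    reward = sum(a == b for a, b in zip(path, TARGET)) / len(TARGET)
    for i, thought in enumerate(path):
        if thought != TARGET[i]:
            return reward, f"step {i} is wrong"
    return reward, None

def rethink(path, feedback):
    """Returns a corrected version of the last thought on the path."""
    return TARGET[len(path) - 1]

print(search(propose, evaluate, rethink))
```

In this toy setup every proposed thought is wrong, so the search can only reach the full target plan because each evaluated node is corrected in place before its subtree is expanded. That is the intuition behind refining the search path instead of accumulating error summaries, shown here under heavily simplified assumptions.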
