RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation

Qingyao Li, Wei Xia, Kounianhua Du, Xinyi Dai, Ruiming Tang, Yasheng Wang, Yong Yu, Weinan Zhang

TL;DR

RethinkMCTS introduces a thought-search framework that merges Monte Carlo Tree Search with a rethink refinement mechanism to explicitly explore and correct the reasoning processes behind code generation. By incorporating block-level execution feedback and a dual evaluation scheme, the approach guides search toward higher-quality reasoning traces and code, outperforming prior reflection-based and tree-search baselines on APPS and HumanEval. Key contributions include the integration of a formal thought-space search, fine-grained verbal feedback, and a refinement step that regenerates erroneous thoughts without degrading accumulated rewards. The work demonstrates that refining the reasoning pathways, rather than merely logging errors, yields substantial improvements in code generation quality and offers a generalizable paradigm for reasoning-intensive LLM tasks.

Abstract

Tree search methods have demonstrated impressive performance in code generation. Previous methods combine tree search with reflection, which summarizes past mistakes to achieve iterative improvement. However, these methods face two significant challenges. First, they search directly within the code language space, neglecting the underlying reasoning process that is critical for effective code generation. Second, reflection-based approaches merely accumulate historical errors in memory without providing correct reasoning pathways, which makes it difficult for subsequent search iterations to identify optimal solutions and lowers search quality. In this work, we propose RethinkMCTS, a framework that systematically explores and refines the reasoning process for code generation. Specifically, we employ MCTS to search for thoughts before code generation and integrate MCTS with a refinement mechanism called rethink, which uses fine-grained code execution feedback to refine erroneous thoughts during the search. This refinement keeps the search on paths backed by better reasoning, improving overall search quality. Through extensive experiments, we demonstrate that RethinkMCTS outperforms previous search-based and feedback-enhanced code generation baselines.
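
The loop described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: the LLM calls are replaced by hypothetical deterministic stubs (propose_thoughts, evaluate, rethink), the task is a toy "absolute value" problem, and the reward is simply the pass rate on public tests. The sketch assumes that a thought whose code fails a test is rewritten in place using the failure message, so the node keeps its position and statistics in the tree.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

# Public test cases for a toy task: return the absolute value of x.
TESTS = [(-3, 3), (0, 0), (5, 5)]

# Hypothetical stand-ins for LLM output; a real system would prompt a model.
CANDIDATE_THOUGHTS = ["return x unchanged", "negate x"]
CODE_FOR_THOUGHT = {
    "return x unchanged": lambda x: x,
    "negate x": lambda x: -x,
    "negate x only when x < 0": lambda x: -x if x < 0 else x,
}

@dataclass
class Node:
    thought: str
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

def propose_thoughts(node):
    """Stub for 'ask the model for candidate next thoughts'."""
    return list(CANDIDATE_THOUGHTS)

def evaluate(thought):
    """Run the code implied by a thought; return (pass rate, verbal feedback)."""
    fn = CODE_FOR_THOUGHT[thought]
    failures = [(x, want, fn(x)) for x, want in TESTS if fn(x) != want]
    feedback = "; ".join(
        f"f({x}) gave {got}, expected {want}" for x, want, got in failures
    )
    return 1 - len(failures) / len(TESTS), feedback

def rethink(thought, feedback):
    """Stub refinement: rewrite an erroneous thought using execution feedback."""
    return "negate x only when x < 0" if feedback else thought

def uct(node, c=1.4):
    """Standard UCT score; unvisited nodes are tried first."""
    if node.visits == 0:
        return math.inf
    explore = math.sqrt(math.log(node.parent.visits) / node.visits)
    return node.value / node.visits + c * explore

def search(iterations=6):
    root = Node(thought="<root>")
    best = (0.0, None)
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: add candidate thoughts below an already visited leaf.
        if node is root or node.visits > 0:
            node.children = [Node(t, parent=node) for t in propose_thoughts(node)]
            node = node.children[0]
        # Evaluation: execute the code implied by the thought.
        reward, feedback = evaluate(node.thought)
        # Rethink: on failure, refine the thought in place and re-evaluate.
        if reward < 1.0:
            node.thought = rethink(node.thought, feedback)
            reward, feedback = evaluate(node.thought)
        if reward > best[0]:
            best = (reward, node.thought)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return best

print(search())
```

In this toy setting both initial thoughts fail at least one test, and the stubbed rethink step replaces the failing thought with one that passes all tests. A reflection-only variant would instead keep the failing thought in the tree and only record the error message.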

Paper Structure

This paper contains 45 sections, 3 equations, 8 figures, 14 tables, 1 algorithm.

Figures (8)

  • Figure 1: Comparison between reflection-based methods and RethinkMCTS. Reflection-based methods keep the error in the search path, whereas RethinkMCTS refines erroneous thoughts and continues along a better path.
  • Figure 2: Overview of RethinkMCTS. We use MCTS to explore different thoughts before generating code. We obtain block-level analysis as verbal feedback through a code executor and use the verbal feedback from failed test cases to refine the thoughts, thereby improving the overall quality of the search tree.
  • Figure 3: Ablation study of block-level analysis (blockInfo), rethink mechanism, verbal feedback (VF), and self-evaluation with GPT-3.5-turbo as the backbone.
  • Figure 4: Performance comparison across search granularities. For advanced models like GPT-3.5-turbo, exploring at the thought level works better.
  • Figure 5: Performance comparison between applying rethink more times and running more rollouts without rethink. Rethink is more effective than increasing rollouts.
  • ...and 3 more figures