Scattered Forest Search: Smarter Code Space Exploration with LLMs

Jonathan Light, Yue Wu, Yiyou Sun, Wenchao Yu, Yanchi Liu, Xujiang Zhao, Ziniu Hu, Haifeng Chen, Wei Cheng

TL;DR

The paper frames code generation as black-box optimization over the code space and introduces Scattered Forest Search (SFS), a set of optimization-inspired techniques that improve both exploration and exploitation during LLM-guided search. SFS combines Scattering (diverse textual directions), Foresting (multi-seed initialization), and Scouting (shared insights) within a Monte Carlo Tree Search framework to avoid local optima and improve inference scaling. A theoretical analysis based on Markov chain concepts illustrates how these techniques help avoid local optima, and extensive experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode show substantial gains in pass@1, faster discovery of correct solutions, and greater solution diversity. The approach scales efficiently with the inference budget, and weaker models benefit most from inference-time optimization, which matters for deployable, resource-efficient code-generation systems. Overall, SFS is a simple, training-free enhancement to search-based code generation that improves accuracy, scalability, and diversity across benchmarks.

Abstract

We frame code generation as a black-box optimization problem within the code space and demonstrate how optimization-inspired techniques can enhance inference scaling. Based on this perspective, we propose SCATTERED FOREST SEARCH (SFS), a novel approach that improves solution diversity and better exploits feedback during evolutionary search. Our theoretical analysis illustrates how these methods help avoid local optima during optimization, leading to more efficient exploration. Extensive experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode reveal significant performance gains. For instance, our method achieves a pass@1 rate of 67.1% on HumanEval+ and 87.2% on HumanEval with GPT-3.5, marking improvements of 8.6% and 4.3% over the state-of-the-art, while also halving the iterations needed to find the correct solution. Furthermore, our approach scales more efficiently than existing search techniques, including tree search, line search, and repeated sampling.
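
To make the black-box framing concrete, here is a minimal sketch of an objective over code space: a candidate program is scored only by the fraction of unit tests it passes, and its source is never inspected. The function name `score`, the `add` task, and the test cases are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical unit tests: (input arguments, expected output) pairs.
TestCase = Tuple[tuple, object]


def score(candidate_source: str, entry_point: str, tests: List[TestCase]) -> float:
    """Black-box objective: fraction of unit tests the candidate passes."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # the candidate is treated as opaque
        func = namespace[entry_point]
    except Exception:
        return 0.0  # code that fails to load passes no tests
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed / len(tests)


# Three points in a toy "code space" for an assumed task: add two integers.
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
candidates = [
    "def add(a, b):\n    return a - b",
    "def add(a, b):\n    return a * b",
    "def add(a, b):\n    return a + b",
]
scores = [score(c, "add", tests) for c in candidates]
best = candidates[scores.index(max(scores))]
```

Any search method over candidate programs, whether repeated sampling, line search, or tree search, can be read as a strategy for choosing which candidate to pass to such a scoring function next.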

Paper Structure

This paper contains 63 sections, 17 equations, 22 figures, 21 tables, 1 algorithm.

Figures (22)

  • Figure 1: 2D Visualization of Code Space represents each point as a possible code solution. The goal is to efficiently search this space for the solution with the best performance, defined by the number of unit tests passed, as indicated by the contours above.
  • Figure 2: Scaling curve for different search methods. We run each method for 10 iterations total using gpt-3.5-turbo on APPS and report the proportion of problems where the correct solution has been discovered at each iteration.
  • Figure 3: Overview of prior methods used for code generation with LLMs. Points represent solutions. Hexagons represent initial solutions. Star represents the final selected solution.
  • Figure 4: Repeated sampling generates multiple solutions using the LLM without leveraging feedback from previous iterations.
  • Figure 5: Line search rigidly exploits feedback and cannot revert to a previous solution if a new change worsens the outcome.
  • ...and 17 more figures
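
The captions for Figures 4 and 5 describe two baseline behaviors that a toy example can illustrate. The sketch below is an assumption-laden illustration, not the paper's implementation: solutions are integers, `toy_score` stands in for the number of unit tests passed, and `toy_revise` stands in for an LLM revising a solution from feedback.

```python
from typing import Callable, List, Tuple


def toy_score(x: int) -> int:
    """Hypothetical stand-in for 'unit tests passed'; peaks at x == 7."""
    return max(0, 10 - abs(x - 7))


def repeated_sampling(samples: List[int]) -> int:
    """Score independent samples and keep the best; no feedback is used."""
    return max(samples, key=toy_score)


def line_search(
    start: int, revise: Callable[[int, int], int], iterations: int
) -> Tuple[int, List[int]]:
    """Revise the current solution from its feedback and always accept the
    result, so a revision that lowers the score cannot be undone."""
    current = start
    history = [current]
    for _ in range(iterations):
        current = revise(current, toy_score(current))
        history.append(current)
    return current, history


def toy_revise(x: int, feedback: int) -> int:
    """Hypothetical reviser: keeps stepping right until the score is perfect."""
    return x if feedback == 10 else x + 3


best_sample = repeated_sampling([1, 12, 5, 8])
final, history = line_search(3, toy_revise, 3)
best_visited = max(history, key=toy_score)
```

In this toy run the line search steps past the peak and ends on a solution that scores lower than one it visited earlier, which is the failure mode the Figure 5 caption describes. Repeated sampling avoids that trap but gains nothing from the scores of earlier samples.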