Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design

Zhi Zheng, Zhuoliang Xie, Zhenkun Wang, Bryan Hooi

TL;DR

This paper tackles the problem of designing high-quality heuristics for complex optimization tasks when using LLM-based automatic heuristic design (AHD). It introduces MCTS-AHD, a Monte Carlo Tree Search-based framework that preserves all generated heuristics and uses a tree-structured exploration with progressive widening to better develop underperforming heuristics and avoid local optima. Through extensive experiments on NP-hard combinatorial optimization problems and cost-aware Bayesian optimization, MCTS-AHD demonstrates superior heuristic quality compared with handcrafted heuristics and prior LLM-based AHD methods, across multiple solving frameworks. The approach offers a robust, framework-agnostic method to expand the space of potential heuristics and has broad applicability beyond traditional CO problems.

Abstract

Handcrafting heuristics for solving complex optimization tasks (e.g., route planning and task allocation) is a common practice but requires extensive domain knowledge. Recently, Large Language Model (LLM)-based automatic heuristic design (AHD) methods have shown promise in generating high-quality heuristics without manual interventions. Existing LLM-based AHD methods employ a population to maintain a fixed number of top-performing LLM-generated heuristics and introduce evolutionary computation (EC) to iteratively enhance the population. However, these population-based procedures cannot fully develop the potential of each heuristic and are prone to converge into local optima. To more comprehensively explore the space of heuristics, this paper proposes to use Monte Carlo Tree Search (MCTS) for LLM-based heuristic evolution. The proposed MCTS-AHD method organizes all LLM-generated heuristics in a tree structure and can better develop the potential of temporarily underperforming heuristics. In experiments, MCTS-AHD delivers significantly higher-quality heuristics on various complex tasks. Our code is available.

Paper Structure

This paper contains 56 sections, 12 equations, 6 figures, 14 tables, and 1 algorithm.

Figures (6)

  • Figure 1: The population (a) generally adopted in existing LLM-based AHD methods (liu2024evolution; ye2024reevo) directly discards low-performance heuristics (those under the red dashed line in (a)), which makes it prone to falling into local optima. MCTS provides chances to develop low-performance heuristics, so it can more comprehensively explore the space of heuristic functions with different features.
  • Figure 2: LLM-based actions in MCTS-AHD for heuristic evolution. Actions include initializing a new heuristic (i1); two mutation actions (m1 and m2) that mutate an existing heuristic function into a new one with a different mechanism or different detail settings; two crossover actions (e1 and e2) that generate a new heuristic from multiple existing ones; and a novel tree-path reasoning action (s1) that derives a better heuristic function from organized function samples on an MCTS tree path from the root node $n_r$ to a leaf node $n_l$.
  • Figure 3: The MCTS process in MCTS-AHD contains four stages: selection, expansion, simulation, and backpropagation. MCTS-AHD takes the performance function value of each node's heuristic as that node's simulated quality value, and the search terminates after a total of $T$ performance evaluations. MCTS-AHD introduces the progressive widening technique to better cross over original heuristic functions with continuously generated new ones. It applies crossover action e1 at the root node and crossover action e2 at other nodes.
  • Figure 4: Evolution curves on two diverse application scenarios.
  • Figure 5: The relation between the approximate heuristic space size $n$ and the optimality gap ratio between EoH and MCTS-AHD (i.e., $1-\text{Gap}_{\text{MCTS-AHD}}/\text{Gap}_{\text{EoH}}$). We consider EoH as the baseline because it is applicable to all tasks. We do not include tasks with unavailable optimal objective values (e.g., online BPP). The legends have been simplified; for example, TSP-GLS-4o represents designing GLS heuristics for TSP with GPT-4o-mini. SC in the figure is an abbreviation for step-by-step construction.
  • ...and 1 more figures
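
The four-stage loop described in the Figure 3 caption can be illustrated with a minimal sketch. This is not the authors' implementation. The LLM actions are replaced by a toy `propose` function that perturbs a number, and `performance` is a made-up objective. The widening rule (`len(children) < ceil(c * visits^alpha)`) is the generic progressive-widening form, not necessarily the rule used in the paper. The UCT formula, the virtual root, and backing up the best score in a subtree are likewise assumptions made for illustration. Only the overall structure comes from the captions: every generated heuristic stays in the tree, and the search stops after $T$ evaluations.

```python
import math
import random

class Node:
    def __init__(self, heuristic, score, parent=None):
        self.heuristic = heuristic  # stand-in for LLM-generated heuristic code
        self.score = score          # performance value of this heuristic
        self.parent = parent
        self.children = []
        self.visits = 1
        self.q = score              # assumed quality value: best score in subtree

def performance(x):
    # Toy performance function (higher is better), peaking at x = 3.
    return -(x - 3.0) ** 2

def propose(parent_x, rng):
    # Hypothetical placeholder for an LLM action such as a mutation.
    return parent_x + rng.gauss(0.0, 1.0)

def can_widen(node, c=1.0, alpha=0.5):
    # Generic progressive widening: the child limit grows with visit count.
    return len(node.children) < math.ceil(c * node.visits ** alpha)

def uct(child, parent, explore=1.0):
    # Standard UCT score; the exact formula in the paper may differ.
    bonus = math.sqrt(math.log(parent.visits + 1) / child.visits)
    return child.q + explore * bonus

def search(total_evals, seed=0):
    rng = random.Random(seed)
    root = Node(None, float("-inf"))  # virtual root holding no heuristic
    all_nodes = []                    # every heuristic is kept, none discarded
    while len(all_nodes) < total_evals:
        # Selection: descend while the current node may not add a child.
        node = root
        while node.children and not can_widen(node):
            node = max(node.children, key=lambda ch: uct(ch, node))
        # Expansion: initialize under the root, otherwise derive from parent.
        if node.heuristic is None:
            x = rng.uniform(-10.0, 10.0)
        else:
            x = propose(node.heuristic, rng)
        # Simulation: one performance evaluation per new node.
        child = Node(x, performance(x), parent=node)
        node.children.append(child)
        all_nodes.append(child)
        # Backpropagation: update visits and quality up to the root.
        cur = node
        while cur is not None:
            cur.visits += 1
            cur.q = max(cur.q, child.q)
            cur = cur.parent
    return root, all_nodes

root, nodes = search(total_evals=50)
best = max(nodes, key=lambda n: n.score)
print(f"evaluated {len(nodes)} heuristics, best score {best.score:.4f}")
```

Because low-scoring nodes remain in the tree, a later selection pass can still reach and expand them once the exploration bonus of their better-scoring siblings has shrunk. That is the behavior the Figure 1 caption contrasts with a fixed-size population.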