A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks

Xinzhe Li

TL;DR

The paper addresses the challenge of comparing LLM inference via search (LIS) frameworks by presenting a unified MDP-based task formulation, modular LLM-Profiled Roles (policy, evaluator, transition model), and reusable search procedures. It surveys 11 LIS frameworks, analyzes deviations from classical search algorithms, and discusses applicability, performance, and efficiency, including open issues like action undoing and irreversible actions. The work proposes an interchangeable, object-oriented interface to enable fair comparisons and future extensibility, and it highlights practical considerations such as computation cost, memory, and potential for parallel research. Together, these contributions offer a structured blueprint for designing, evaluating, and extending LIS systems in language reasoning, web navigation, graph traversal, and tool-based tasks with LLMs. The paper thus serves as a foundational reference for researchers and engineers building and benchmarking test-time compute via search in real-world LLM applications.

Abstract

LLM test-time compute (or LLM inference) via search has emerged as a promising research area with rapid developments. However, current frameworks often adopt distinct perspectives on three key aspects: task definition, LLM profiling, and search procedures, making direct comparisons challenging. Moreover, the search algorithms employed often diverge from standard implementations, and their specific characteristics are not thoroughly specified. This survey aims to provide a comprehensive yet integrated technical review of existing LLM inference via search (LIS) frameworks. Specifically, we unify task definitions under the Markov Decision Process (MDP) formalism and provide modular definitions of LLM profiling and search procedures. The definitions enable precise comparisons of various LLM inference frameworks while highlighting their departures from conventional search algorithms. We also discuss the applicability, performance, and efficiency of these methods. For ongoing paper updates, please refer to our GitHub repository: https://github.com/xinzhel/LLM-Search.
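To make the modular view concrete, the following is a minimal sketch, not the survey's interface, of how the three roles named in the TL;DR (policy, evaluator, transition model) could be plugged into one reusable search procedure over an MDP-style task. Every name here (`Roles`, `beam_search`, `make_toy_roles`) is hypothetical, and the roles are plain deterministic functions on a toy integer task; in an actual LIS framework each role would presumably wrap a prompted LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

State = int   # toy state; a real task would use text or an environment observation
Action = str  # toy action label


@dataclass
class Roles:
    """Three interchangeable roles (hypothetical interface for illustration)."""
    policy: Callable[[State], List[Action]]       # proposes candidate actions
    transition: Callable[[State, Action], State]  # predicts the next state
    evaluator: Callable[[State], float]           # scores a state, higher is better


def beam_search(
    roles: Roles,
    start: State,
    is_goal: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 10,
) -> Optional[Tuple[State, List[Action]]]:
    """Generic beam search that only talks to the task through the roles."""
    beam: List[Tuple[State, List[Action]]] = [(start, [])]
    for _ in range(max_depth):
        # Stop as soon as any state on the beam satisfies the goal test.
        for state, path in beam:
            if is_goal(state):
                return state, path
        # Expand every beam entry with the actions the policy proposes.
        candidates = []
        for state, path in beam:
            for action in roles.policy(state):
                nxt = roles.transition(state, action)
                candidates.append((nxt, path + [action]))
        if not candidates:
            break
        # Keep the top-scoring states according to the evaluator.
        candidates.sort(key=lambda c: roles.evaluator(c[0]), reverse=True)
        beam = candidates[:beam_width]
    # Final check after the last expansion.
    for state, path in beam:
        if is_goal(state):
            return state, path
    return None


def make_toy_roles(target: int) -> Roles:
    """Stub roles for a toy task: reach `target` using 'inc' and 'double'."""
    def transition(state: State, action: Action) -> State:
        return state + 1 if action == "inc" else state * 2

    return Roles(
        policy=lambda state: ["inc", "double"],
        transition=transition,
        evaluator=lambda state: -abs(target - state),
    )


if __name__ == "__main__":
    roles = make_toy_roles(10)
    print(beam_search(roles, start=1, is_goal=lambda s: s == 10))
```

Because the search routine depends only on the `Roles` container, swapping in a different evaluator or transition model does not require touching `beam_search`, which is the kind of interchangeability the TL;DR describes.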

Paper Structure (129 sections, 8 equations, 1 figure, 17 tables)

Introduction
Existing Surveys
No Dedicated, Detailed Survey
Limited Mention on LLM-Side Design
Limited Mention on Search
Survey Structure
Introducing a Unified Task Definition Based on MDPs
Comprehensively Summarizing LLM Profiling and Implementations
Defining Modular Search Procedures
Reviewing Individual Frameworks
Comparisons with Other Test-Time Frameworks
Analyzing Key Perspectives of LIS Frameworks
Other LLM Inference + Search Directions
Intended Audience and Use Cases
Reviewed Venues
...and 114 more sections

Figures (1)

Figure 1: Survey structure. It shows a selected subset of the works reviewed. Two notes: 1) To avoid duplication, comprehensive lists of related work for Task Definitions and LLM-Profiled Roles are provided in the referenced tables; 2) Framework names marked with a dagger symbol ($\dagger$) denote those devised by the authors of this survey, rather than being directly drawn from existing literature.
