THOUGHTSCULPT: Reasoning with Intermediate Revision and Search
Yizhou Chi, Kevin Yang, Dan Klein
TL;DR
THOUGHTSCULPT lets large language models reason iteratively while revising their intermediate outputs. It is a general framework built around three modules (Thought Evaluator, Thought Generator, and Decision Simulator) and uses Monte Carlo Tree Search to navigate a graph of thought nodes whose actions include revisions. On Story Outline Improvement, Mini-Crosswords, and Constrained Generation, THOUGHTSCULPT outperforms strong baselines, especially when using MCTS, while remaining inference-only with no extra training. The approach offers a flexible, task-agnostic planner for long-form reasoning and creative generation whose outputs are structured and revisable, at the cost of higher computational demand.
Abstract
We present THOUGHTSCULPT, a general reasoning and search method for tasks with outputs that can be decomposed into components. THOUGHTSCULPT explores a search tree of potential solutions using Monte Carlo Tree Search (MCTS), building solutions one action at a time and evaluating according to any domain-specific heuristic, which in practice is often simply an LLM evaluator. Critically, our action space includes revision actions: THOUGHTSCULPT may choose to revise part of its previous output rather than continuing to build the rest of its output. Empirically, THOUGHTSCULPT outperforms state-of-the-art reasoning methods across three challenging tasks: Story Outline Improvement (up to +30% interestingness), Mini-Crosswords Solving (up to +16% word success rate), and Constrained Generation (up to +10% concept coverage).
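To make the search loop described above concrete, here is a minimal sketch of MCTS over an action space that mixes continuation and revision actions. It is a toy illustration under stated assumptions, not the authors' implementation: the task (matching a hidden four-letter sequence), the names `TARGET`, `ALPHABET`, `evaluate`, `actions`, `apply_action`, `Node`, `ucb1`, `rollout`, and `mcts`, and the UCB1 constant are all hypothetical, and `evaluate` is a simple stand-in for the LLM evaluator the abstract mentions.

```python
import math
import random

# Hypothetical toy task: the "solution" is a sequence of components.
TARGET = ("a", "b", "c", "d")    # hidden answer, used only by the toy evaluator
ALPHABET = ("a", "b", "c", "d")

def evaluate(state):
    """Toy stand-in for an LLM evaluator: fraction of target slots that are correct."""
    return sum(x == y for x, y in zip(state, TARGET)) / len(TARGET)

def actions(state):
    """Continuation actions extend the output; revision actions rewrite an earlier part."""
    acts = []
    if len(state) < len(TARGET):
        acts += [("append", None, ch) for ch in ALPHABET]
    acts += [("revise", i, ch)
             for i in range(len(state)) for ch in ALPHABET if ch != state[i]]
    return acts

def apply_action(state, action):
    kind, index, value = action
    if kind == "append":
        return state + (value,)
    return state[:index] + (value,) + state[index + 1:]

class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.untried = actions(state)   # actions not yet expanded from this node
        self.visits = 0
        self.value = 0.0                # sum of rewards backed up through this node

def ucb1(child, parent_visits, c=1.4):
    # Mean reward plus an exploration bonus; c=1.4 is an arbitrary choice here.
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(state, rng, depth=3):
    """Simulate a few random actions and return the best score seen along the way."""
    best = evaluate(state)
    for _ in range(depth):
        acts = actions(state)
        if not acts:
            break
        state = apply_action(state, rng.choice(acts))
        best = max(best, evaluate(state))
    return best

def mcts(root_state, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    best_state, best_score = root_state, evaluate(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            parent_visits = node.visits
            node = max(node.children, key=lambda ch: ucb1(ch, parent_visits))
        # Expansion: try one unexplored action (continuation or revision).
        if node.untried:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(apply_action(node.state, action), parent=node)
            node.children.append(child)
            node = child
        score = evaluate(node.state)
        if score > best_score:
            best_state, best_score = node.state, score
        # Simulation, then backpropagation of the reward to the root.
        reward = rollout(node.state, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return best_state, best_score
```

Calling `mcts(())` builds a solution from scratch, while `mcts(("d", "c", "b", "a"))` starts from a complete but wrong draft, so any improvement must come from revision actions. In a real setting the evaluator and the candidate actions would presumably come from LLM calls rather than from the fixed functions used here.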
