STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Zachary Bamberger, Till R. Saenger, Gilad Morad, Ofra Amir, Brandon M. Stewart, Amir Feder
TL;DR
STATe addresses the need for controllable and interpretable reasoning in inference-time compute (ITC) for large language models by introducing discrete action templates into a Tree-of-Thoughts framework. The approach runs a Plan→Generate→Evaluate→Select loop with a controller, a generator, and an evaluator to produce diverse yet high-quality outputs, and demonstrates benefits on NoveltyBench and in an argument-generation case study. The key contributions are (1) a controllable action-space search that improves diversity without sacrificing quality, (2) an attribution framework that links action traces to output quality in order to identify promising strategies, and (3) a method that steers generation toward unexplored but high-potential regions of the action space via targeted trajectory exploration. The work provides a practical framework for generating diverse, high-quality, and explainable text, with publicly available code.
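The Plan→Generate→Evaluate→Select loop can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the released implementation: the names `Node` and `state_search`, the beam-style selection rule, and the toy controller, generator, and evaluator (which stand in for LLM calls) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Node:
    """A partial reasoning trajectory (hypothetical structure)."""
    text: str = ""
    actions: Tuple[str, ...] = ()  # action trace that produced `text`
    score: float = 0.0


def state_search(
    prompt: str,
    controller: Callable[[Node], Sequence[str]],
    generator: Callable[[str, Node, str], str],
    evaluator: Callable[[str, str], float],
    depth: int = 2,
    beam_width: int = 2,
) -> List[Node]:
    """Run `depth` rounds of Plan -> Generate -> Evaluate -> Select."""
    frontier = [Node()]
    for _ in range(depth):
        candidates = []
        for node in frontier:
            for action in controller(node):             # Plan: pick discrete actions
                step = generator(prompt, node, action)  # Generate: action-conditioned step
                text = (node.text + " " + step).strip()
                score = evaluator(prompt, text)         # Evaluate: score the candidate
                candidates.append(Node(text, node.actions + (action,), score))
        # Select: keep the top-scoring candidates (beam-style rule, an assumption)
        frontier = sorted(candidates, key=lambda n: n.score, reverse=True)[:beam_width]
    return frontier


# Toy stand-ins for the LLM-backed components (illustrative only).
ACTIONS = ("analogy", "counterexample", "evidence")


def toy_controller(node: Node) -> Sequence[str]:
    # Propose every action not yet used on this trajectory.
    return [a for a in ACTIONS if a not in node.actions]


def toy_generator(prompt: str, node: Node, action: str) -> str:
    # A real generator would produce a reasoning step conditioned on `action`.
    return f"[{action}]"


def toy_evaluator(prompt: str, text: str) -> float:
    # A real evaluator would judge quality; text length is a placeholder.
    return float(len(text))


results = state_search("Argue for X.", toy_controller, toy_generator, toy_evaluator)
for node in results:
    print(node.actions, node.score)
```

Because each returned node carries its full action trace, the search output can be inspected or analyzed directly in terms of the discrete choices that produced it.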
Abstract
Inference-Time-Compute (ITC) methods like Best-of-N and Tree-of-Thoughts are meant to produce output candidates that are both high-quality and diverse, but their use of high-temperature sampling often fails to achieve meaningful output diversity. Moreover, existing ITC methods offer limited control over how to perform reasoning, which in turn limits their explainability. We present STATe-of-Thoughts (STATe), an interpretable ITC method that searches over high-level reasoning patterns. STATe replaces stochastic sampling with discrete and interpretable textual interventions: a controller selects actions encoding high-level reasoning choices, a generator produces reasoning steps conditioned on those choices, and an evaluator scores candidates to guide search. This structured approach yields three main advantages. First, action-guided textual interventions produce greater response diversity than temperature-based sampling. Second, in a case study on argument generation, STATe's explicit action sequences capture interpretable features that are highly predictive of output quality. Third, estimating the association between performance and action choices allows us to identify promising yet unexplored regions of the action space and steer generation directly toward them. Together, these results establish STATe as a practical framework for generating high-quality, diverse, and interpretable text. Our framework is available at https://github.com/zbambergerNLP/state-of-thoughts.
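As a rough illustration of the third point, the sketch below estimates how each action is associated with output quality and then ranks action sequences that have not yet been tried. It is not the attribution method used in the paper, which this abstract does not specify: the difference-in-means estimator, the additive and order-insensitive scoring of unexplored sequences, the function names, and the toy scores are all assumptions.

```python
from itertools import permutations
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Trace = Tuple[str, ...]


def action_effects(traces: List[Tuple[Trace, float]]) -> Dict[str, float]:
    """Difference in mean score between trajectories with and without each action."""
    actions = sorted({a for trace, _ in traces for a in trace})
    effects = {}
    for a in actions:
        with_a = [s for trace, s in traces if a in trace]
        without_a = [s for trace, s in traces if a not in trace]
        if with_a and without_a:  # skip actions that cannot be compared
            effects[a] = mean(with_a) - mean(without_a)
    return effects


def rank_unexplored(
    traces: List[Tuple[Trace, float]],
    actions: Sequence[str],
    length: int,
) -> List[Trace]:
    """Rank unseen action sequences by the sum of their estimated effects."""
    effects = action_effects(traces)
    seen = {trace for trace, _ in traces}
    unexplored = [p for p in permutations(actions, length) if p not in seen]
    return sorted(
        unexplored,
        key=lambda p: sum(effects.get(a, 0.0) for a in p),
        reverse=True,
    )


# Toy (action trace, quality score) pairs; the numbers are made up.
observed = [
    (("analogy", "evidence"), 8.0),
    (("analogy", "counterexample"), 5.0),
    (("counterexample", "evidence"), 7.0),
]

effects = action_effects(observed)
ranking = rank_unexplored(observed, ("analogy", "counterexample", "evidence"), 2)
print(effects)
print(ranking[0])  # the unexplored sequence with the highest estimated potential
```

The top-ranked sequence could then be passed to the search as a forced trajectory. A more careful analysis would also account for action order and for interactions between actions, both of which this additive sketch ignores.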
