Constraints-of-Thought: A Framework for Constrained Reasoning in Language-Model-Guided Search

Kamel Alrashedy, Vriksha Srihari, Zulfiqar Zaidi, Ridam Srivastava, Pradyumna Tambwekar, Matthew Gombolay

TL;DR

Constraints-of-Thought (Const-o-T) introduces a structured ⟨intent, constraint⟩ representation to guide LLM-driven planning via Monte Carlo Tree Search (MCTS), transforming high-level strategies into verifiable constraints that prune infeasible paths and keep plans aligned with user intent. By integrating Const-o-T with MCTS in a POMDP-based planning framework, the approach improves planning efficiency, reduces hallucinations, and yields more semantically valid plans across the Risk game, CAD code generation, and arithmetic reasoning. Empirical results show higher accuracy and structural alignment than CoT/ToT baselines, lower branching factors, and faster inference times, with a user study confirming improved transparency and usability. The work suggests that constraint-guided reasoning provides a generalizable foundation for reliable, domain-adaptive planning in complex, multi-step tasks.

Abstract

While researchers have made significant progress in enabling large language models (LLMs) to perform multi-step planning, LLMs struggle to ensure that those plans align with high-level user intent and satisfy symbolic constraints, especially in complex, multi-step domains. Existing reasoning approaches, such as Chain-of-Thought (CoT), Tree-of-Thought (ToT), and verifier-augmented methods, expand the search space but often yield infeasible actions or hallucinated steps. To overcome these limitations, we propose Constraints-of-Thought (Const-o-T), a framework that provides a structured prior that enables Monte Carlo Tree Search (MCTS) to focus search on semantically meaningful paths. Each reasoning step is represented as an (intent, constraint) pair, which serves both to compress the search space and to enforce validity. Unlike prior methods that merely generate reasoning traces or validate outputs post hoc, Const-o-T uses (intent, constraint) pairs to actively focus the search toward feasible and meaningful plans. We integrate Const-o-T into MCTS using a structured representation of intent-constraint pairs: constraints prune infeasible branches and guide exploration toward semantically valid actions, improving planning efficiency and verifiable decision-making. We demonstrate across three domains (the Risk game, CAD code generation, and arithmetic reasoning) that our approach outperforms baselines, yielding higher accuracy and stronger structural alignment. Our contribution is to demonstrate that Const-o-T offers a generalizable foundation for constraint-guided reasoning, enabling more efficient, constraint-aligned, and domain-adaptable planning with LLMs.
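
The pruning idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `IntentConstraint` class, the `prune_actions` helper, and the toy arithmetic constraint are hypothetical names and values chosen only to show how an (intent, constraint) pair might filter candidate actions before a tree search expands them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class IntentConstraint:
    """Hypothetical container for one (intent, constraint) reasoning step."""
    intent: str                        # natural-language goal for this step
    constraint: Callable[[int], bool]  # predicate an action must satisfy


def prune_actions(candidates: List[int], step: IntentConstraint) -> List[int]:
    """Keep only the candidate actions that satisfy the step's constraint."""
    return [action for action in candidates if step.constraint(action)]


# Toy example (assumed for illustration): candidate actions are integers,
# and the constraint only admits even values.
step = IntentConstraint(
    intent="keep the running total even",
    constraint=lambda value: value % 2 == 0,
)
candidates = [3, 4, 7, 10]
feasible = prune_actions(candidates, step)

# The search would expand only the feasible children, so the branching
# factor at this node drops from len(candidates) to len(feasible).
print(len(candidates), len(feasible), feasible)
```

In a full search loop, a filter of this kind would presumably run at the expansion step, so that rollouts are never spent on children that violate the current constraint.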

Paper Structure

This paper contains 31 sections, 6 equations, 15 figures, 8 tables, 2 algorithms.

Figures (15)

  • Figure 1: Const-o-T empowers LLMs to (i) infer intent statements, and (ii) extract a corresponding constraint from a high-level strategy, guiding MCTS toward optimal, rule-compliant actions.
  • Figure 2: Distribution of plan lengths relative to ground truth for GPT-4.
  • Figure 3: User study ratings across three interaction modes: alignment, agnostic, and adversarial. Statistical significance is indicated by asterisks ($^*p < 0.05$, $^{**}p < 0.01$, $^{***}p < 0.001$).
  • Figure 4: Branching factor with error bars across search steps for GPT-4 (left) and LLaMA-3 (right).
  • Figure 5: Average inference time per example for GPT-4 and LLaMA 3.3 across three approaches.
  • ...and 10 more figures