State Machine of Thoughts: Leveraging Past Reasoning Trajectories for Enhancing Problem Solving

Jia Liu, Jie Shuai, Xiyao Li

TL;DR

The proposed State Machine of Thoughts (SMoT) uses experience recorded from past reasoning trajectories to select optimal sub-solutions and avoid incorrect ones. It significantly improves problem-solving ability on two exploration-intensive problems: the 24-point game and a taxi navigation reinforcement learning game.

Abstract

Current Large Language Model-based agents reason within an exploration-evaluation framework, navigating the problem-solving process in a tree-like manner. However, these methods often discard successful reasoning trajectories once a problem is resolved, so the trajectories go unused when analogous problems arise later. To address this inefficiency, we adopt a state machine to record experience derived from previous reasoning trajectories. Within the state machine, states represent decomposed sub-problems, while state transitions reflect the dependencies among sub-problems. The state machine records both successful and failed trajectories. Using this recorded experience, our proposed State Machine of Thoughts (SMoT) selects optimal sub-solutions and avoids incorrect ones. Our experiments show that SMoT significantly improves problem-solving ability on two exploration-intensive problems: the 24-point game and a taxi navigation reinforcement learning game.
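To make the idea of recording trajectories in a state machine concrete, here is a minimal Python sketch for the 24-point game. It is an illustration under stated assumptions, not the authors' implementation: the names `StateMachineMemory`, `canon`, `successors`, and `solve` are hypothetical, and a plain exhaustive search stands in for the LLM-driven exploration and evaluation described in the abstract. A state is the multiset of numbers still to be combined, a transition combines two of them, and the memory records both successful and failed states so that later games can reuse them.

```python
from fractions import Fraction
from itertools import combinations

TARGET = Fraction(24)


class StateMachineMemory:
    """Hypothetical experience store (assumption, not the paper's data structure)."""

    def __init__(self):
        self.solvable = {}   # state -> True (reaches 24) or False (known dead end)
        self.best_move = {}  # state -> (operation text, next state) on a successful path


def canon(nums):
    """Canonical state: the remaining numbers as a sorted tuple."""
    return tuple(sorted(nums))


def successors(state):
    """Yield (operation text, next state) for every way of combining two numbers."""
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        candidates = [(f"{a}+{b}", a + b), (f"{a}*{b}", a * b),
                      (f"{a}-{b}", a - b), (f"{b}-{a}", b - a)]
        if b != 0:
            candidates.append((f"{a}/{b}", a / b))
        if a != 0:
            candidates.append((f"{b}/{a}", b / a))
        for text, value in candidates:
            yield text, canon(rest + [value])


def solve(state, memory, stats):
    """Depth-first search that consults the memory before exploring a state."""
    if state in memory.solvable:
        stats["hits"] += 1          # experience reused, no exploration needed
        return memory.solvable[state]
    stats["expanded"] += 1
    if len(state) == 1:
        ok = state[0] == TARGET
    else:
        ok = False
        for text, nxt in successors(state):
            if solve(nxt, memory, stats):
                memory.best_move[state] = (text, nxt)  # record the successful transition
                ok = True
                break
    memory.solvable[state] = ok     # record success and failure alike
    return ok


memory = StateMachineMemory()
stats = {"hits": 0, "expanded": 0}
start = canon(Fraction(n) for n in (3, 3, 8, 8))
found = solve(start, memory, stats)
```

Calling `solve` again with the same `memory` answers from the recorded state without expanding anything, which is the reuse the abstract argues for. Exact `Fraction` arithmetic is a design choice of this sketch so that intermediate results such as 8/3 compare exactly.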

Paper Structure (34 sections, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: Motivation for improving 24-Point Card Game solving. The goal is to combine four given numbers to reach a total of 24. By documenting, from past trajectories, which sets of numbers can or cannot reach this total, we store experience that helps agents quickly decide the best next move when they encounter the same numbers in future games.
  • Figure 2: Illustration of two traversal methods for constructing the state machine. Green and red boxes denote sub-problems evaluated as conducive and non-conducive to solving the task, respectively.
  • Figure 3: Comparison of State Machine of Thoughts (SMoT) with other prompting strategies
  • Figure 4: Five manually designed situations for experiments