Monte Carlo Planning with Large Language Model for Text-Based Game Agents

Zijing Shi; Meng Fang; Ling Chen

TL;DR

This work introduces MC-DML (Monte Carlo planning with Dynamic Memory-guided Large language model), a planning algorithm that pairs Monte Carlo Tree Search (MCTS) with a Large Language Model (LLM) that uses in-trial and cross-trial memory to guide action evaluation in text-based games. LLM reasoning enters the MCTS expansion through an LLM-provided action prior and memory-informed signals, which targets two weaknesses: the slow, iteration-heavy training of planning-then-learning approaches and the brittle exploration of pure LLM policies. On the Jericho benchmark, MC-DML improves performance in the initial planning phase across multiple games, including games with challenging bottleneck states, and outperforms strong contemporary methods that require multiple iterations. Ablations indicate that both memory components and dynamic pruning contribute to these gains. The approach supports more sample-efficient, memory-aware decision making in partially observable, high-branching environments, and may carry over to other language-conditioned planning tasks.
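To make the idea of an action prior inside tree search concrete, here is a minimal Python sketch of a PUCT-style selection rule, a common way to combine a prior with search statistics. It is an illustration under stated assumptions, not the authors' implementation: the `Node` class, its field names, the constant `c_puct`, and the hard-coded prior values that stand in for LLM output are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One search-tree node. All field names are hypothetical."""
    prior: Dict[str, float]                                     # P(a | s), e.g. from an LLM
    visits: Dict[str, int] = field(default_factory=dict)        # N(s, a)
    value_sum: Dict[str, float] = field(default_factory=dict)   # sum of returns per action

    def q(self, action: str) -> float:
        """Mean return of an action; 0.0 if it has never been tried."""
        n = self.visits.get(action, 0)
        return self.value_sum.get(action, 0.0) / n if n else 0.0


def puct_select(node: Node, c_puct: float = 1.0) -> str:
    """Pick the action maximizing Q(s, a) + exploration bonus.

    The bonus is c_puct * P(a | s) * sqrt(N(s) + 1) / (1 + N(s, a)).
    The "+ 1" under the square root is a variation chosen here so that
    the prior already matters on the very first visit.
    """
    total = sum(node.visits.values())

    def score(action: str) -> float:
        bonus = (c_puct * node.prior[action] * math.sqrt(total + 1)
                 / (1 + node.visits.get(action, 0)))
        return node.q(action) + bonus

    return max(node.prior, key=score)


def update(node: Node, action: str, ret: float) -> None:
    """Back up one simulated return into the node statistics."""
    node.visits[action] = node.visits.get(action, 0) + 1
    node.value_sum[action] = node.value_sum.get(action, 0.0) + ret


if __name__ == "__main__":
    # Hand-written prior standing in for an LLM's action probabilities.
    root = Node(prior={"open mailbox": 0.7, "go north": 0.2, "eat leaflet": 0.1})
    print(puct_select(root))
```

With no visits recorded, the rule follows the prior. As an action accumulates visits without reward, its bonus shrinks and lower-prior actions get tried, which is how the search can recover when the prior is wrong.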

Abstract

Text-based games provide valuable environments for language-based autonomous agents. However, planning-then-learning paradigms, such as those combining Monte Carlo Tree Search (MCTS) and reinforcement learning (RL), are notably time-consuming due to extensive iterations. Additionally, these algorithms perform uncertainty-driven exploration but lack language understanding and reasoning abilities. In this paper, we introduce the Monte Carlo planning with Dynamic Memory-guided Large language model (MC-DML) algorithm. MC-DML leverages the language understanding and reasoning capabilities of Large Language Models (LLMs) alongside the exploratory advantages of tree search algorithms. Specifically, we enhance LLMs with in-trial and cross-trial memory mechanisms, enabling them to learn from past experiences and dynamically adjust action evaluations during planning. We conduct experiments on a series of text-based games from the Jericho benchmark. Our results demonstrate that the MC-DML algorithm significantly enhances performance across various games at the initial planning phase, outperforming strong contemporary methods that require multiple iterations. This demonstrates the effectiveness of our algorithm, paving the way for more efficient language-grounded planning in complex environments.
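The abstract does not specify how the two memories are stored or fed to the LLM, so the following Python sketch shows only one plausible arrangement. It assumes that in-trial memory is a short window of recent observation-action pairs and that cross-trial memory is a list of textual reflections carried over from earlier trials. The `DynamicMemory` class, its methods, the prompt layout, and `softmax_prior` are all hypothetical names introduced for illustration, and no real LLM is called.

```python
import math
from collections import deque
from typing import Deque, Dict, List, Tuple


class DynamicMemory:
    """Hypothetical container for in-trial and cross-trial memory."""

    def __init__(self, in_trial_window: int = 5) -> None:
        # Recent (observation, action) pairs from the current trial only.
        self.in_trial: Deque[Tuple[str, str]] = deque(maxlen=in_trial_window)
        # Reflections that persist across trials.
        self.cross_trial: List[str] = []

    def record_step(self, observation: str, action: str) -> None:
        self.in_trial.append((observation, action))

    def end_trial(self, reflection: str) -> None:
        """Store a lesson from the finished trial and reset the window."""
        self.cross_trial.append(reflection)
        self.in_trial.clear()

    def build_prompt(self, observation: str, valid_actions: List[str]) -> str:
        """Assemble the text an LLM would score actions against."""
        reflections = [f"- {r}" for r in self.cross_trial] or ["- (none)"]
        history = [f"> {a}\n{o}" for o, a in self.in_trial] or ["(start of trial)"]
        lines = ["Reflections from earlier trials:", *reflections,
                 "Recent history:", *history,
                 f"Current observation: {observation}",
                 "Valid actions: " + ", ".join(valid_actions)]
        return "\n".join(lines)


def softmax_prior(scores: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Turn per-action scores (e.g. LLM log-probabilities) into a prior."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {a: math.exp((s - top) / temperature) for a, s in scores.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}


if __name__ == "__main__":
    memory = DynamicMemory(in_trial_window=2)
    memory.record_step("You are standing in an open field.", "go north")
    memory.end_trial("Entering the dark room without a light source ended the game.")
    print(memory.build_prompt("You are in a dark room.", ["turn on lamp", "go north"]))
```

In this arrangement the reflection written at the end of one trial appears in every later prompt, so the scores behind the prior can shift between trials without any parameter updates.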
