LARP: Language-Agent Role Play for Open-World Games

Ming Yan, Ruihao Li, Hao Zhang, Hao Wang, Zhilan Yang, Ji Yan

TL;DR

LARP addresses the challenge of building flexible, memory-rich language agents for open-world games. It proposes a modular cognitive architecture (long-term memory, working memory, memory processing, and decision making) paired with an environment interaction layer that uses a learnable action space, plus postprocessing that aligns agent output with distinct personalities. The framework uses a cluster of domain-specialized language models, encodes memories in probabilistic and logic representations, and recalls them through self-ask and multi-modal search, supported by entity APIs and RLHF-driven refinement. Diverse personas are modeled with LoRA-tuned model clusters, and postprocessing verifies actions and prevents conflicts. Together, these components aim to deliver coherent, culturally varied NPCs and scalable, adaptive behaviors that enrich open-world gameplay and user experience.

Abstract

Language agents have shown impressive problem-solving skills within defined settings and brief timelines. Yet, with the ever-evolving complexities of open-world simulations, there's a pressing need for agents that can flexibly adapt to complex environments and consistently maintain a long-term memory to ensure coherent actions. To bridge the gap between language agents and open-world games, we introduce Language Agent for Role-Playing (LARP), which includes a cognitive architecture that encompasses memory processing and a decision-making assistant, an environment interaction module with a feedback-driven learnable action space, and a postprocessing method that promotes the alignment of various personalities. The LARP framework refines interactions between users and agents, predefined with unique backgrounds and personalities, ultimately enhancing the gaming experience in open-world contexts. Furthermore, it highlights the diverse uses of language models in a range of areas such as entertainment, education, and various simulation scenarios. The project page is released at https://miao-ai-lab.github.io/LARP/.

Paper Structure

This paper contains 15 sections, 1 equation, and 4 figures.

Figures (4)

  • Figure 1: Overview of the cognitive architecture of LARP.
  • Figure 2: Cognitive Workflow of LARP. The workflow forms a cycle: information from long-term memory and observation is processed in the memory processing module and transmitted to the working memory module. The information in the working memory module, together with the observed information, is passed to the decision-making assistant, which generates a decision or dialogue. Memory processing has three main stages: encoding, storage, and recall. Encoding transforms information into a form that can be stored in memory. Storage maintains information in memory. Recall retrieves information from memory.
  • Figure 3: Detailed control flow of the psychological process of recall. The agent first self-asks about the observation to produce self-ask questions. Using these questions as queries, it retrieves memories in three ways: (1) generating predicate logic statements in a logic programming language and a probabilistic programming language from the queries; (2) running a vector similarity search after extracting keywords from the queries; (3) searching for question-answer pairs by sentence similarity between the queries and the questions of the stored question-answer pairs. $Q_{\text{self-ask}}$ denotes the self-ask questions used as queries, $Q_{\text{logic}}$ the predicate logic query statements, $Q_{\text{key}}$ the extracted keywords, and $Q'A$ the question-answer pairs.
  • Figure 4: Environment Interaction.
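
The cycle in Figure 2 and the three retrieval paths in Figure 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name here (`LongTermMemory`, `recall_logic`, `recall_vector`, `recall_qa`, `cognitive_step`), the thresholds, and the stop-word list are hypothetical. The paper's language-model and logic-programming components are replaced with simple stand-ins: a caller-supplied `ask` function for self-asking, triple matching for the logic query, bag-of-words cosine similarity for the vector search, and `difflib` string similarity for question-answer matching.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from difflib import SequenceMatcher

# Hypothetical stop-word list used for keyword extraction (Q_key).
STOPWORDS = {"who", "is", "when", "does", "the", "a", "at", "what"}


@dataclass
class LongTermMemory:
    """Toy long-term store holding three views of memory (all assumed)."""
    facts: list[tuple[str, str, str]] = field(default_factory=list)  # (subject, relation, object)
    notes: list[str] = field(default_factory=list)                   # free-text memories
    qa_pairs: list[tuple[str, str]] = field(default_factory=list)    # (question, answer)


def keyword_vector(text: str) -> Counter:
    """Extract keywords and return them as a bag-of-words vector."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recall_logic(query: str, memory: LongTermMemory) -> list[str]:
    """Path 1 stand-in: match stored triples whose subject appears in the query."""
    subjects = set(keyword_vector(query))
    return [f"{s} {r} {o}" for s, r, o in memory.facts if s in subjects]


def recall_vector(query: str, memory: LongTermMemory, threshold: float = 0.3) -> list[str]:
    """Path 2 stand-in: keyword extraction, then cosine similarity over notes."""
    q_key = keyword_vector(query)
    return [n for n in memory.notes if cosine(q_key, keyword_vector(n)) >= threshold]


def recall_qa(query: str, memory: LongTermMemory, threshold: float = 0.8) -> list[str]:
    """Path 3 stand-in: compare the query with stored questions, return answers."""
    return [
        answer
        for question, answer in memory.qa_pairs
        if SequenceMatcher(None, query.lower(), question.lower()).ratio() >= threshold
    ]


def recall(observation: str, memory: LongTermMemory, ask) -> list[str]:
    """Self-ask about the observation, then run all three retrieval paths."""
    recalled: list[str] = []
    for query in ask(observation):  # Q_self-ask; `ask` stands in for a language model
        hits = recall_logic(query, memory) + recall_vector(query, memory) + recall_qa(query, memory)
        for hit in hits:
            if hit not in recalled:  # keep order, drop duplicates
                recalled.append(hit)
    return recalled


def cognitive_step(observation: str, memory: LongTermMemory, ask, decide):
    """One pass of the cycle: recall into working memory, decide, then store."""
    working_memory = recall(observation, memory, ask)
    decision = decide(observation, working_memory)
    memory.notes.append(observation)  # assumption: the raw observation is stored as-is
    return decision, working_memory


if __name__ == "__main__":
    memory = LongTermMemory(
        facts=[("alice", "owes", "bob")],
        notes=["The tavern closes at midnight."],
        qa_pairs=[("Who is Alice?", "Alice is a regular customer.")],
    )
    canned_ask = lambda obs: ["Who is Alice?", "When does the tavern close?"]
    canned_decide = lambda obs, wm: f"Greet Alice, drawing on {len(wm)} memories"
    print(cognitive_step("Alice enters the tavern", memory, canned_ask, canned_decide))
```

In this sketch the observation is stored only after recall, so that it does not retrieve itself. A real system would replace `ask` and `decide` with language-model calls and swap the similarity functions for learned embeddings.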