RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy

Zhenhang Yuan; Shenghai Yuan; Lihua Xie

Abstract

LLM agents often fail in closed-world embodied environments because actions must satisfy strict preconditions -- such as location, inventory, and container states -- and failure feedback is sparse. We identify two structurally coupled failure modes: (P1) invalid action generation and (P2) state drift, each amplifying the other in a degenerative cycle. We present RPMS, a conflict-managed architecture that enforces action feasibility via structured rule retrieval, gates memory applicability via a lightweight belief state, and resolves conflicts between the two sources via rules-first arbitration. On ALFWorld (134 unseen tasks), RPMS achieves 59.7% single-trial success with Llama 3.1 8B (+23.9 pp over baseline) and 98.5% with Claude Sonnet 4.5 (+11.9 pp); of the 8B gain, rule retrieval alone contributes +14.9 pp (statistically significant), making it the dominant factor. A key finding is that episodic memory is conditionally useful: it harms performance on some task types when used without grounding, but becomes a stable net positive once filtered by current state and constrained by explicit action rules. Adapting RPMS to ScienceWorld with GPT-4 yields consistent gains across all ablation conditions (avg. score 54.0 vs. 44.9 for the ReAct baseline), providing transfer evidence that the core mechanisms hold across structurally distinct environments.
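As a quick consistency check on the figures above, the reported percentages can be converted back into task counts. The sketch below assumes each of the 134 tasks is scored once as success or failure. The resulting counts are inferred from the percentages and are not stated in the abstract; the function and variable names are illustrative only.

```python
TOTAL_TASKS = 134  # ALFWorld unseen tasks, as stated in the abstract


def implied_successes(rate_pct: float, total: int = TOTAL_TASKS) -> int:
    """Nearest whole number of solved tasks consistent with a reported rate."""
    return round(rate_pct / 100 * total)


# Reported RPMS success rates and gains over baseline (percentage points).
reported = {
    "llama_3.1_8b": {"rpms": 59.7, "gain": 23.9},
    "claude_sonnet_4.5": {"rpms": 98.5, "gain": 11.9},
}

implied = {}
for name, r in reported.items():
    # Baseline rate is not quoted directly; it follows from rate minus gain.
    baseline_pct = round(r["rpms"] - r["gain"], 1)
    implied[name] = {
        "baseline_pct": baseline_pct,
        "rpms_tasks": implied_successes(r["rpms"]),
        "baseline_tasks": implied_successes(baseline_pct),
    }

# Rules-only configuration for the 8B model: baseline plus the +14.9 pp gain.
rules_only_pct = round(implied["llama_3.1_8b"]["baseline_pct"] + 14.9, 1)
rules_only_tasks = implied_successes(rules_only_pct)

for name, values in implied.items():
    print(name, values)
print("rules_only_8b", rules_only_pct, rules_only_tasks)
```

Under this reading, the 8B baseline sits at 35.8% and the rules-only configuration at 50.7%, so rule retrieval accounts for 14.9 of the 23.9 pp total gain.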

Paper Structure (63 sections, 7 equations, 5 figures, 10 tables, 1 algorithm)

Introduction
Related Work
LLMs for Embodied Decision-Making
Reflective and Memory-Augmented Agents
Knowledge-Augmented Planning
Method
Belief State Tracking
Sufficiency.
Signatures.
Hierarchical Rule Manual
Episodic Memory with State-Consistent Filtering
Retrieval.
State-Consistent Filtering (SCF).
Rules-First Arbitration
Prompt Construction and Decision Procedure
...and 48 more sections

Figures (5)

Figure 1: Representative methods positioned along C1 (executability enforcement) and C2 (state consistency control): ReAct [yao2023react], Generative Agents [10.1145/3586183.3606763], Reflexion [shinn2023reflexion], MemGPT [packer2024memgptllmsoperatingsystems], Inner Monologue [huang2022innermonologue], Voyager [wang2023voyager], LATS [pmlr-v235-zhou24r], Action Attention [wu2022tackling], Imperative Learning [doi:10.1177/02783649251353181], CAPE [raman2024capecorrectiveactionsprecondition], NeSyC [choi2025nesyc]. Most prior work addresses one axis; RPMS targets both.
Figure 2: RPMS architecture. Each step: (1) parse observation into BeliefState and GoalSpec; (2) query Rule Manual and Episodic Memory in parallel; (3) filter experiences by state-signature compatibility; (4) resolve conflicts via Rules-First Arbitration; (5) query LLM with augmented prompt.
Figure 3: RPMS agent architecture overview, showing rule injection (C1: executability enforcement) and state-consistent memory filtering (C2: state consistency control) as the two core components that augment the LLM decision loop.
Figure 4: 2$\times$2 ablation results on ALFWorld (left, success rate %; backbone: Llama 3.1 8B) and ScienceWorld (right, avg. score 0--100; backbone: GPT-4). Dotted lines mark the additive-expectation baseline in each environment.
Figure 5: Learning curve: success rate vs. learning rounds for Memory-only and Rules+Memory configurations.
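
The per-step loop described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every class, field, and function name here is hypothetical, the state signature is a deliberately coarse stand-in, the single rule is an invented example, and observation parsing (step 1) and the LLM call (step 5) are left out, so the function returns the augmented prompt instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class BeliefState:
    """Hypothetical lightweight belief state (assumed already parsed)."""
    location: str
    holding: Optional[str] = None

    def signature(self) -> Tuple[str, bool]:
        # Assumed coarse signature: where the agent is, and whether it holds something.
        return (self.location, self.holding is not None)


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: display text plus a check for forbidden actions."""
    text: str
    forbids: Callable[[BeliefState, str], bool]


@dataclass(frozen=True)
class Experience:
    """Hypothetical memory entry: state signature and the action taken there."""
    signature: Tuple[str, bool]
    suggested_action: str


def filter_by_signature(belief: BeliefState, memory: List[Experience]) -> List[Experience]:
    # Step 3: keep only experiences recorded in a compatible state.
    return [e for e in memory if e.signature == belief.signature()]


def arbitrate(belief: BeliefState, rules: List[Rule],
              experiences: List[Experience]) -> List[Experience]:
    # Step 4: rules take priority, so drop any experience a rule forbids.
    return [e for e in experiences
            if not any(r.forbids(belief, e.suggested_action) for r in rules)]


def plan_step(observation: str, goal: str, belief: BeliefState,
              rules: List[Rule], memory: List[Experience]) -> str:
    kept = arbitrate(belief, rules, filter_by_signature(belief, memory))
    lines = [f"Goal: {goal}", f"Observation: {observation}",
             f"Belief: location={belief.location}, holding={belief.holding}",
             "Rules:"]
    lines += [f"- {r.text}" for r in rules]
    lines.append("Relevant experience:")
    lines += [f"- {e.suggested_action}" for e in kept]
    return "\n".join(lines)  # augmented prompt that would be sent to the LLM


# Usage with an invented rule and invented memory entries.
one_object = Rule("You can hold only one object at a time.",
                  lambda b, a: a.startswith("take ") and b.holding is not None)
belief = BeliefState("countertop 1", holding="mug 1")
memory = [
    Experience(("countertop 1", True), "take apple 1 from countertop 1"),
    Experience(("countertop 1", True), "go to microwave 1"),
    Experience(("fridge 1", False), "open fridge 1"),
]
prompt = plan_step("You are at countertop 1.", "heat mug 1",
                   belief, [one_object], memory)
print(prompt)
```

In this example the fridge experience is dropped by signature filtering, the `take` experience is dropped by arbitration because the agent already holds an object, and only the `go to microwave 1` hint reaches the prompt.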
