RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems

Mingcong Lei, Honghao Cai, Zezhou Cui, Liangchen Tan, Junkun Hong, Gehan Hu, Shuangyu Zhu, Yimou Wu, Shaohan Jiang, Ge Wang, Yuyuan Yang, Junyuan Tan, Zhenglin Wan, Zhen Li, Shuguang Cui, Yiming Zhao, Yatong Han

TL;DR

RoboMemory addresses the challenge of long-horizon planning in partially observable embodied environments by integrating four memory types (Spatial, Temporal, Episodic, Semantic) within a brain-inspired, parallel architecture. It employs a dynamic Spatial Knowledge Graph and a closed-loop Planner-Critic to enable efficient memory updates and adaptive decision-making, validated on EmbodiedBench and via real-world robot trials. The results show RoboMemory outperforms both open- and closed-source baselines by substantial margins and demonstrates cumulative learning across repeated tasks, though limitations remain in executor reliability and perception. This work provides a scalable memory-augmented foundation that bridges cognitive neuroscience with robotic autonomy, enabling more robust, interactive environmental learning in physical systems.

Abstract

Embodied agents face persistent challenges in real-world environments, including partial observability, limited spatial reasoning, and high-latency multi-memory integration. We present RoboMemory, a brain-inspired framework that unifies Spatial, Temporal, Episodic, and Semantic memory under a parallelized architecture for efficient long-horizon planning and interactive environmental learning. A dynamic spatial knowledge graph (KG) ensures scalable and consistent memory updates, while a closed-loop planner with a critic module supports adaptive decision-making in dynamic settings. Experiments on EmbodiedBench show that RoboMemory, built on Qwen2.5-VL-72B-Ins, improves average success rates by 25% over its baseline and exceeds the closed-source state-of-the-art (SOTA) Gemini-1.5-Pro by 3%. Real-world trials further confirm its capacity for cumulative learning, with performance improving across repeated tasks. These results highlight RoboMemory as a scalable foundation for memory-augmented embodied intelligence, bridging the gap between cognitive neuroscience and robotic autonomy.

Paper Structure

This paper contains 39 sections, 2 theorems, 10 equations, 14 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Let $G = (V, E)$ be a finite directed graph with maximum out-degree $D \geq 1$, and let $\mathcal{S} \subseteq V$ be a set of $M$ source vertices. Define the $K$-hop neighborhood $\mathcal{N}_K(s)$ of a vertex $s \in \mathcal{S}$ as the set of vertices reachable from $s$ via directed paths of length at most $K$. … satisfies the following upper bound:

Figures (14)

  • Figure 1: RoboMemory adopts a brain-inspired architecture that maps neural components to agent modules, enabling long-term planning and interactive learning across diverse environments (real-world, Habitat, ALFRED) and robotic hardware.
  • Figure 2: RoboMemory architecture. (a) Left: The parallel Step Summarizer and Query Generator produce updates and queries for the Comprehensive Embodied Memory. These memories enable Closed-Loop Planning for tasks like "slice and pick up the apple": the Planner generates plans, while the Critic and the memories adjust decisions using feedback from visual inputs and action results. (b) Right: Spatial memory maintains a KG updated by relevance and similarity, and Semantic/Episodic memory manages a Vector DB with analogous logic. Temporal memory is implemented as a linear FIFO buffer that stores the step-wise summaries generated by the Step Summarizer.
  • Figure 3: Efficiency improvement of Comprehensive Embodied Memory System
  • Figure 4: Visualization of the experimental environment.
  • Figure 5: The improvement of RoboMemory after learning in the real world.
  • ...and 9 more figures
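
The Figure 2 caption describes Temporal memory as a linear FIFO buffer of step-wise summaries. A minimal sketch of that idea in Python follows. The class name `TemporalMemory`, the `capacity` parameter, and its default value are assumptions made for illustration and do not come from the paper.

```python
from collections import deque


class TemporalMemory:
    """Hypothetical fixed-capacity FIFO buffer of step summaries.

    The capacity and method names are illustrative assumptions,
    not the paper's implementation.
    """

    def __init__(self, capacity: int = 4) -> None:
        # A deque with maxlen drops the oldest entry once it is full.
        self._buffer = deque(maxlen=capacity)

    def add(self, summary: str) -> None:
        """Append the newest step summary, evicting the oldest if full."""
        self._buffer.append(summary)

    def recall(self) -> list:
        """Return the stored summaries, oldest first."""
        return list(self._buffer)


memory = TemporalMemory(capacity=2)
for step in ["open fridge", "pick up apple", "slice apple"]:
    memory.add(step)
print(memory.recall())
```

With a capacity of 2, only the two most recent summaries remain, which is the first-in, first-out behaviour the caption refers to.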

Theorems & Definitions (4)

  • Theorem 1: Upper Bound on K-hop Vertex Extraction in Directed Graphs
  • proof
  • Theorem 2: Upper Bound for K-hop Vertex Extraction in Normalized Directed Graphs
  • proof
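
Both theorems concern K-hop vertex extraction from a set of source vertices in a directed graph. The sketch below shows one way such an extraction could be done with a multi-source breadth-first search. It is an illustration under stated assumptions, not the paper's algorithm: the adjacency-dictionary representation and the function names are hypothetical. The helper `per_source_bound` computes the standard counting bound $1 + D + \dots + D^K$ on the vertices reachable from one source (counting the source itself), which may differ in form from the bound the paper states.

```python
from collections import deque


def k_hop_neighborhood(adj, sources, k):
    """Vertices reachable from any source via directed paths of length <= k.

    `adj` maps each vertex to an iterable of its out-neighbors
    (a hypothetical representation chosen for this sketch).
    """
    depth = {s: 0 for s in sources}  # shortest hop count from the nearest source
    queue = deque(depth)
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue  # do not expand beyond k hops
        for v in adj.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return set(depth)


def per_source_bound(max_out_degree, k):
    """Standard counting bound: at most D**i new vertices at hop i."""
    return sum(max_out_degree ** i for i in range(k + 1))


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
reached = k_hop_neighborhood(graph, ["a"], 2)
print(sorted(reached))
print(len(reached) <= per_source_bound(2, 2))
```

For $M$ sources, the union of the neighborhoods contains at most $M$ times the per-source count, so the extraction cost is bounded independently of the total graph size.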