Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilization

Taeyoon Kwon, Dongwook Choi, Hyojun Kim, Sunghwan Kim, Seungjun Moon, Beong-woo Kwak, Kuan-Hao Huang, Jinyoung Yeo

TL;DR

The paper addresses the challenge of personalized assistance in embodied agents by dissecting memory utilization along object semantics and user patterns. It introduces MEMENTO, a two-stage evaluation framework with single-memory and joint-memory tasks to quantify how well memories influence grounding and planning, revealing bottlenecks from information overload and poor multi-memory coordination. Through experiments across multiple models, the authors show that episodic memory alone is insufficient and that a hierarchical knowledge-graph-based user-profile memory significantly improves performance, especially for sequential user-pattern reasoning. The work highlights the need for dedicated memory architectures that separate personalized knowledge from contextual histories to enable robust, scalable personalized embodied agents with practical implications for long-term, user-specific interactions.

Abstract

LLM-powered embodied agents have shown success on conventional object-rearrangement tasks, but providing personalized assistance that leverages user-specific knowledge from past interactions presents new challenges. We investigate these challenges through the lens of agents' memory utilization along two critical dimensions: object semantics (identifying objects based on personal meaning) and user patterns (recalling sequences from behavioral routines). To assess these capabilities, we construct MEMENTO, an end-to-end two-stage evaluation framework comprising single-memory and joint-memory tasks. Our experiments reveal that current agents can recall simple object semantics but struggle to apply sequential user patterns to planning. Through in-depth analysis, we identify two critical bottlenecks: information overload and coordination failures when handling multiple memories. Based on these findings, we explore memory architectural approaches to address these challenges. Given our observation that episodic memory provides both personalized knowledge and in-context learning benefits, we design a hierarchical knowledge graph-based user-profile memory module that separately manages personalized knowledge, achieving substantial improvements on both single and joint-memory tasks. Project website: https://connoriginal.github.io/MEMENTO

Paper Structure

This paper contains 80 sections, 1 equation, 24 figures, 21 tables, 2 algorithms.

Figures (24)

  • Figure 1: Comparison between conventional embodied tasks and personalized assistance tasks. Previous works focus on following simple instructions, while personalized assistance agents must draw on user-specific knowledge, which requires grounding in past interactions.
  • Figure 2: Overview of MEMENTO. The framework evaluates memory utilization capability by comparing agent performance on tasks with identical goals but varying instructions at each stage.
  • Figure 3: Performance results without episodic memory. "Original" indicates episodes from the PartNR dataset.
  • Figure 4: Results of the analysis by personalized knowledge type (single-memory).
  • Figure 5: Taxonomy of successful and failed cases in memory utilization with illustrative examples. Top: success and failure cases of object semantics; Bottom: success and failure cases of user patterns.
  • ...and 19 more figures