UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents
Han Xiao, Guozhi Wang, Hao Wang, Shilong Liu, Yuxiang Chai, Yue Pan, Yufeng Zhou, Xiaoxin Chen, Yafei Wen, Hongsheng Li
TL;DR
UI-Mem tackles two core bottlenecks of online reinforcement learning for mobile GUI agents: inefficient credit assignment in long-horizon tasks and the lack of cross-task experience transfer. It introduces a Hierarchical Experience Memory that stores workflows, subtask skills, and failure patterns as parameterized templates, enabling reuse across tasks and applications. The framework combines Memory-Guided Exploration with Stratified Group Sampling and a Self-Evolving Loop: the memory supplies dense subtask guidance during training, while adaptive retrieval and memory updates let the policy gradually internalize that guidance. On online GUI benchmarks, UI-Mem outperforms traditional RL baselines and static reuse strategies and generalizes across tasks and to unseen applications.
Abstract
Online Reinforcement Learning (RL) offers a promising paradigm for enhancing GUI agents through direct environment interaction. However, its effectiveness is severely hindered by inefficient credit assignment in long-horizon tasks and repetitive errors across tasks due to the lack of experience transfer. To address these challenges, we propose UI-Mem, a novel framework that enhances GUI online RL with a Hierarchical Experience Memory. Unlike traditional replay buffers, our memory accumulates structured knowledge, including high-level workflows, subtask skills, and failure patterns. These experiences are stored as parameterized templates that enable cross-task and cross-application transfer. To effectively integrate memory guidance into online RL, we introduce Stratified Group Sampling, which injects varying levels of guidance across trajectories within each rollout group to maintain outcome diversity, driving the unguided policy toward internalizing guided behaviors. Furthermore, a Self-Evolving Loop continuously abstracts novel strategies and errors to keep the memory aligned with the agent's evolving policy. Experiments on online GUI benchmarks demonstrate that UI-Mem significantly outperforms traditional RL baselines and static reuse strategies, with strong generalization to unseen applications. Project page: https://ui-mem.github.io
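The two mechanisms named in the abstract, parameterized templates and Stratified Group Sampling, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the `MemoryTemplate` class, the three guidance strata, the round-robin assignment, and the mean-centered group advantage (a common choice in group-based RL) are all hypothetical.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class MemoryTemplate:
    """Hypothetical memory entry; the field names are illustrative only."""
    level: str  # "workflow", "subtask", or "failure"
    text: str   # guidance text with $placeholders for task-specific values

    def instantiate(self, **params: str) -> str:
        # Fill the placeholders so one entry can serve many tasks and apps.
        return Template(self.text).safe_substitute(**params)


# Assumed strata, from no guidance to the most detailed guidance.
GUIDANCE_LEVELS = ("none", "workflow", "workflow+subtask")


def assign_guidance(group_size: int, levels=GUIDANCE_LEVELS) -> list[str]:
    """Spread guidance levels round-robin over one rollout group."""
    return [levels[i % len(levels)] for i in range(group_size)]


def group_advantages(rewards: list[float]) -> list[float]:
    """Mean-centered advantages within a group (an assumed estimator)."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


if __name__ == "__main__":
    entry = MemoryTemplate("workflow", "Open $app, then search for $query")
    print(entry.instantiate(app="Maps", query="coffee"))
    print(assign_guidance(6))
    # Identical outcomes give zero advantages, hence no learning signal.
    print(group_advantages([1.0, 0.0, 1.0, 0.0]))
```

Under these assumptions, mixing guidance levels within a group matters because a group whose trajectories all succeed or all fail has zero mean-centered advantages. Varying the guidance keeps the outcomes diverse enough to give the unguided rollouts a learning signal.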
