Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models
Georgios Tziafas, Hamidreza Kasaei
TL;DR
This paper addresses the challenge of lifelong manipulation with language grounding by introducing LRLL, a memory-augmented, gradient-free agent that continuously grows a library of composable robot skills. It combines wake-sleep cycles, a soft experience memory, and a skill abstraction module to distill past interactions into new Python-based primitives, enabling scalable, interpretable policies without fine-tuning. In simulation, LRLL outperforms end-to-end and static-LLM baselines, and its learned skills transfer to real-world dual-arm manipulation, with gains in generalization and memory efficiency and no catastrophic forgetting. The approach paves the way for scalable, human-in-the-loop, language-grounded robotic systems that autonomously expand their capabilities without gradient-based optimization, while pointing to future work on multimodal perception and faster, cheaper LLMs for practical deployment.
Abstract
Large Language Models (LLMs) have emerged as a new paradigm for embodied reasoning and control, most recently by generating robot policy code that utilizes a custom library of vision and control primitive skills. However, prior work fixes the skill library and steers the LLM with carefully hand-crafted prompt engineering, limiting the agent to a stationary range of addressable tasks. In this work, we introduce LRLL, an LLM-based lifelong learning agent that continuously grows the robot skill library to tackle manipulation tasks of ever-growing complexity. LRLL achieves this with four novel contributions: 1) a soft memory module that allows dynamic storage and retrieval of past experiences to serve as context, 2) a self-guided exploration policy that proposes new tasks in simulation, 3) a skill abstractor that distills recent experiences into new library skills, and 4) a lifelong learning algorithm that enables human users to bootstrap new skills with minimal online interaction. LRLL continuously transfers knowledge from the memory to the library, building composable, general, and interpretable policies, while bypassing gradient-based optimization and thus sparing the learner from catastrophic forgetting. Empirical evaluation in a simulated tabletop environment shows that LRLL outperforms end-to-end and vanilla LLM approaches in the lifelong setup while learning skills that are transferable to the real world. Project material will become available at the webpage https://gtziafas.github.io/LRLL_project.
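To make the memory-to-library knowledge transfer concrete, the loop described in the abstract could be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: all class and function names (`SoftMemory`, `SkillLibrary`, `sleep_phase`) are assumptions, retrieval is reduced to word overlap in place of the paper's soft retrieval, and a stored code string stands in for LLM-generated policy code.

```python
# Hypothetical sketch of LRLL's memory -> library distillation loop.
# Names and logic are illustrative assumptions, not the paper's actual API.

class SoftMemory:
    """Stores (task, code) experiences; retrieves by simple word overlap,
    standing in for the paper's soft retrieval mechanism."""

    def __init__(self):
        self.entries = []  # list of (task_description, policy_code) pairs

    def add(self, task, code):
        self.entries.append((task, code))

    def retrieve(self, query, k=3):
        # Score each stored experience by word overlap with the query.
        def score(entry):
            return len(set(query.split()) & set(entry[0].split()))
        return sorted(self.entries, key=score, reverse=True)[:k]


class SkillLibrary:
    """Growing library of named, composable skills (here: callables)."""

    def __init__(self):
        self.skills = {}

    def register(self, name, fn):
        self.skills[name] = fn


def sleep_phase(memory, library):
    """Toy stand-in for the skill abstractor: distill the most recent
    experience into a reusable library skill and return its name."""
    task, code = memory.entries[-1]
    name = "skill_" + "_".join(task.split()[:2])
    library.register(name, lambda: code)  # real LRLL would emit a Python primitive
    return name
```

In the actual system, the wake phase would use `retrieve` to assemble in-context examples for the LLM, and the sleep phase would invoke the LLM to abstract several experiences into one general primitive; the sketch only shows the bookkeeping that makes the library grow without any gradient updates.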
