Towards General Purpose Robots at Scale: Lifelong Learning and Learning to Use Memory
William Yue
TL;DR
This thesis tackles the challenge of deploying general-purpose robots that operate over long time horizons by addressing two capabilities: memory and lifelong learning. It introduces t-DGR, a non-autoregressive, trajectory-based deep generative replay method that achieves state-of-the-art performance on Continual World benchmarks, and AttentionTuner, a memory-guided learning framework that teaches Transformer-based agents to use memory through human-annotated memory dependency pairs. Experiments on Continual World and on memory-demanding LTMB tasks show that t-DGR mitigates catastrophic forgetting and that AttentionTuner improves long-term credit assignment and generalization, even with very sparse annotations. Together, these results support combining scalable continual learning with memory-aware imitation learning so that robots can learn durably and use memory efficiently across extended deployments in unstructured real-world environments.
Abstract
The widespread success of artificial intelligence in fields like natural language processing and computer vision has not yet fully transferred to robotics, where progress is hindered by the lack of large-scale training data and the complexity of real-world tasks. To address this, many robot learning researchers are pushing to get robots deployed at scale in everyday unstructured environments like our homes to initiate a data flywheel. While current robot learning systems are effective for certain short-horizon tasks, they are not designed to autonomously operate over long time horizons in unstructured environments. This thesis focuses on addressing two key challenges for robots operating over long time horizons: memory and lifelong learning. We propose two novel methods to advance these capabilities. First, we introduce t-DGR, a trajectory-based deep generative replay method that achieves state-of-the-art performance on Continual World benchmarks, advancing lifelong learning. Second, we develop a framework that leverages human demonstrations to teach agents effective memory utilization, improving learning efficiency and success rates on Memory Gym tasks. Finally, we discuss future directions for achieving the lifelong learning and memory capabilities necessary for robots to function at scale in real-world settings.
