MGDA: Model-based Goal Data Augmentation for Offline Goal-conditioned Weighted Supervised Learning
Xing Lei, Xuetao Zhang, Donglin Wang
TL;DR
The paper addresses offline goal-conditioned RL and the stitching limitation of GCWSL by introducing a principled model-based data augmentation method, MGDA, which uses a learned dynamics model under a local Lipschitz continuity (LLC) constraint to generate plausible augmented goals. MGDA is guided by three unified principles (Goal Diversity, Action Optimality, and Goal Reachability) and comes with a theoretical guarantee that it approximates the one-step stitching distribution $p^{\text{1-step}}(g \mid s,a)$ within an error bound of $\mathcal{O}(\epsilon_k L_1)$. The approach is validated on state-based and vision-based offline maze datasets, where MGDA-enhanced GCWSL improves stitching performance over existing augmentation methods, and ablations highlight the critical role of the LLC constraint. Overall, MGDA offers a principled, scalable augmentation framework that enhances trajectory stitching in offline GCWSL and may apply to other goal-conditioned supervised-learning paradigms.
Abstract
Recently, a state-of-the-art family of algorithms, known as Goal-Conditioned Weighted Supervised Learning (GCWSL) methods, has been introduced to tackle challenges in offline goal-conditioned reinforcement learning (RL). GCWSL optimizes a lower bound of the goal-conditioned RL objective and has demonstrated outstanding performance across diverse goal-reaching tasks, providing a simple, effective, and stable solution. However, prior research has identified a critical limitation of GCWSL: the lack of trajectory stitching capabilities. To address this, goal data augmentation strategies have been proposed to enhance these methods. Nevertheless, existing techniques often struggle to sample suitable augmented goals for GCWSL effectively. In this paper, we establish unified principles for goal data augmentation, focusing on goal diversity, action optimality, and goal reachability. Based on these principles, we propose a Model-based Goal Data Augmentation (MGDA) approach, which leverages a learned dynamics model to sample more suitable augmented goals. MGDA uniquely incorporates the local Lipschitz continuity assumption within the learned model to mitigate the impact of compounding errors. Empirical results show that MGDA significantly enhances the performance of GCWSL methods on both state-based and vision-based maze datasets, surpassing previous goal data augmentation techniques in improving stitching capabilities.
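To make the role of local Lipschitz continuity more concrete, the following is a minimal sketch and not the authors' procedure. It assumes a hypothetical toy dynamics model (`toy_dynamics`), estimates a local Lipschitz constant numerically from small random perturbations of the state-action input, and accepts a model-predicted state as a candidate augmented goal only when that estimate stays below a threshold. The function names, the perturbation radius, and the threshold `max_lipschitz` are illustrative assumptions; the paper may enforce the constraint differently, for example during model training.

```python
import numpy as np


def toy_dynamics(state, action):
    """Hypothetical stand-in for a learned dynamics model: s' = s + 0.1 * tanh(a)."""
    return state + 0.1 * np.tanh(action)


def local_lipschitz_estimate(model, state, action, radius=1e-2, n_samples=64, seed=0):
    """Sampled lower estimate of the model's local Lipschitz constant at (state, action).

    Perturbs the concatenated (state, action) input on a sphere of the given
    radius and returns the largest observed ratio
    ||f(x + delta) - f(x)|| / ||delta||.
    """
    rng = np.random.default_rng(seed)
    base = model(state, action)
    s_dim = state.shape[0]
    best = 0.0
    for _ in range(n_samples):
        delta = rng.normal(size=s_dim + action.shape[0])
        delta *= radius / np.linalg.norm(delta)  # rescale so ||delta|| == radius
        perturbed = model(state + delta[:s_dim], action + delta[s_dim:])
        best = max(best, float(np.linalg.norm(perturbed - base)) / radius)
    return best


def augment_goal(model, state, action, max_lipschitz=1.5):
    """Return the predicted next state as a candidate goal, or None if rejected.

    Assumption for illustration: a prediction is trusted only where the model
    is locally smooth, i.e. its estimated Lipschitz constant is small.
    """
    if local_lipschitz_estimate(model, state, action) > max_lipschitz:
        return None
    return model(state, action)


if __name__ == "__main__":
    s = np.array([0.5, -0.2])
    a = np.array([0.3, 0.1])
    print("local Lipschitz estimate:", local_lipschitz_estimate(toy_dynamics, s, a))
    print("candidate goal:", augment_goal(toy_dynamics, s, a))
```

The intuition behind this design is that a model whose output changes sharply under small input perturbations is also likely to amplify its own prediction errors, so restricting augmentation to locally smooth regions is one plausible way to limit compounding error.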
