DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
Jinxin Liu, Xinghong Guo, Zifeng Zhuang, Donglin Wang
TL;DR
DIDI addresses offline behavioral generation from reward-free, multimodal data. It trains a contextual policy with a diffusion prior as a regularizer to induce diverse skills, and uses a three-step pseudo-labeling loop to keep the learned behaviors within the offline data. The method enables skill stitching and skill interpolation and, given an extrinsic reward, supports reward-guided generation. It shows strong diversity and competitive performance across Push, Kitchen, Humanoid, and D4RL tasks. These results suggest that diffusion-guided diversity is a practical way to learn generalist skill spaces in offline settings and to reuse them for downstream tasks.
Abstract
In this paper, we propose a novel approach called DIffusion-guided DIversity (DIDI) for offline behavioral generation. The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data. We achieve this by using diffusion probabilistic models as priors that guide the learning process and regularize the policy. By optimizing a joint objective that combines a diversity term with diffusion-guided regularization, we encourage diverse behaviors to emerge while keeping them similar to the offline data. Experimental results in four decision-making domains (Push, Kitchen, Humanoid, and D4RL tasks) show that DIDI is effective in discovering diverse and discriminative skills. We also introduce skill stitching and skill interpolation, which highlight the generalist nature of the learned skill space. Further, by incorporating an extrinsic reward function, DIDI enables reward-guided behavior generation, facilitating the learning of diverse and optimal behaviors from sub-optimal data.
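
As a rough illustration of what a joint objective of this kind can look like, the sketch below combines a standard variational lower bound on the mutual information between skills and trajectories with a KL penalty toward a diffusion prior. The notation is hypothetical and is not taken from the paper: the skill variable z, its prior p(z), the discriminator q_\phi, the prior p_diff, and the weight \lambda are all assumptions introduced here, and the exact form of DIDI's regularizer may differ.

```latex
% Hypothetical notation (assumed for illustration, not the paper's):
%   z                  skill (context) variable with prior p(z)
%   \tau               trajectory generated by the contextual policy \pi_\theta(\cdot \mid z)
%   q_\phi             learned skill discriminator
%   p_{\mathrm{diff}}  diffusion prior fit to the offline data
%   \lambda \ge 0      regularization weight
\max_{\theta,\,\phi}\;
\underbrace{\mathbb{E}_{z \sim p(z),\; \tau \sim \pi_\theta(\cdot \mid z)}
  \big[ \log q_\phi(z \mid \tau) - \log p(z) \big]}_{\text{diversity: variational lower bound on } I(\tau; z)}
\;-\;
\lambda\,
\underbrace{\mathbb{E}_{z \sim p(z)}
  \Big[ D_{\mathrm{KL}}\big( \pi_\theta(\tau \mid z) \,\|\, p_{\mathrm{diff}}(\tau) \big) \Big]}_{\text{diffusion-guided regularization}}
```

In this sketch the first term rewards skills that a discriminator can tell apart, while the second term penalizes trajectories that the diffusion prior considers unlikely under the offline data; \lambda trades off the two.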
