Diffusion Models for Reinforcement Learning: A Survey
Zhengbang Zhu, Hanye Zhao, Haoran He, Yichao Zhong, Shenyu Zhang, Haoquan Guo, Tingting Chen, Weinan Zhang
TL;DR
This survey examines how diffusion models address core RL challenges such as data efficiency, distribution shift, planning errors, and multitask generalization. It organizes diffusion-RL methods into planners, policies, and data synthesizers, and surveys foundational techniques (DDPM and score-based models) alongside guided and fast sampling. The paper catalogs applications across offline/online RL, imitation learning, trajectory generation, and data augmentation, and highlights future directions including generative environment simulation, safety integration, retrieval-augmented generation, and skill composition. Overall, it positions diffusion models as a unifying framework that enhances expressiveness, data utilization, and planning robustness in RL, while outlining practical avenues for research and development.
Abstract
Diffusion models surpass previous generative models in sample quality and training stability. Recent works have shown the advantages of diffusion models in improving reinforcement learning (RL) solutions. This survey provides an overview of this emerging field and aims to inspire new avenues of research. First, we examine several challenges encountered by RL algorithms. Then, we present a taxonomy of existing methods based on the roles that diffusion models play in RL and explore how the preceding challenges are addressed. We further outline successful applications of diffusion models in various RL-related tasks. Finally, we conclude the survey and offer insights into future research directions. We actively maintain a GitHub repository of papers and other resources related to diffusion models in RL: https://github.com/apexrl/Diff4RLSurvey.
