The Duality of Generative AI and Reinforcement Learning in Robotics: A Review
Angelo Moroncelli, Vishal Soni, Marco Forgione, Dario Piga, Blerina Spahiu, Loris Roveda
TL;DR
This review analyzes the intersection of generative AI and reinforcement learning (RL) in robotics, focusing on how transformer- and diffusion-based models can act as priors and policies within RL systems. It presents a dual taxonomy: Generative Tools for RL (models like LLMs, VLMs, diffusion, world models, and VPMs used in RL pipelines) and RL for Generative Policies (pre-training, fine-tuning, and distillation of generative policies). The paper synthesizes evidence across three core RL tasks—Reward Signal, State Representation, and Planning & Exploration—illustrating how each generative tool contributes to performance, generalization, and data efficiency, while also addressing safety and grounding challenges. It concludes with practical recommendations and future directions, including RL from human feedback, actor-critic foundation models, and constraint-aware control, to realize scalable and safe robot learning with generative models.
Abstract
Generative AI and reinforcement learning (RL) have recently been redefining what is possible for AI agents that take information flows as input and produce intelligent behavior. Similar advances are now appearing in embodied AI and robotics for control policy generation. This review examines the integration of generative AI models with RL to advance robotics, with a primary focus on the duality between generative AI and RL for downstream robotics tasks. Specifically, we investigate: (1) the role of prominent generative AI tools as modular priors for multi-modal input fusion in RL tasks; and (2) how RL can train, fine-tune, and distill generative models for policy generation, such as VLA models, similarly to RL applications in large language models. We then propose a new taxonomy based on a considerable number of selected papers. Lastly, we identify open challenges concerning model scalability, adaptation, and grounding, and give recommendations and insights on future research directions. We reflect on which generative AI models best fit the RL tasks and why. Conversely, we reflect on important issues inherent to RL-enhanced generative policies, such as safety concerns and failure modes, and on the limitations of current methods. A curated collection of relevant research papers is maintained on our GitHub repository, serving as a resource for ongoing research and development in this field: https://github.com/clmoro/Robotics-RL-FMs-Integration.
