Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges
Giorgio Franceschelli, Mirco Musolesi
TL;DR
The paper addresses how Reinforcement Learning (RL) can enhance generative AI by allowing optimization of non-differentiable rewards, alignment with human values, and flexible objective formulations. It introduces a three-way taxonomy: (i) RL as an alternative way to generate without a specified objective, (ii) generation that concurrently maximizes an objective function, and (iii) shaping traits that are hard to quantify via reward modeling. It surveys state-of-the-art methods, including SeqGAN, RLHF, reward modeling, MIXER-like strategies, diffusion-policy optimization, and molecular design via RL. Key contributions include a structured synthesis of existing work, a discussion of domain-specific rewards, and a candid assessment of challenges such as sparse rewards, reward hacking, and the cost of human feedback, with proposed directions such as inverse RL (IRL), offline RL, and multi-agent RL. The survey aims to guide researchers and practitioners toward practical integration of RL in generative systems, highlighting its potential impact on the text, code, music, image, and chemistry domains while noting significant open research questions and methodological gaps.
Abstract
Generative Artificial Intelligence (AI) is one of the most exciting developments in Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has emerged as a very successful paradigm for a variety of machine learning tasks. In this survey, we discuss the state of the art, opportunities and open research questions in applying RL to generative AI. In particular, we will discuss three types of applications, namely, RL as an alternative way for generation without specified objectives; as a way for generating outputs while concurrently maximizing an objective function; and, finally, as a way of embedding desired characteristics, which cannot be easily captured by means of an objective function, into the generative process. We conclude the survey with an in-depth discussion of the opportunities and challenges in this fascinating emerging area.
