Controlling Neural Style Transfer with Deep Reinforcement Learning
Chengming Feng, Jing Hu, Xin Wang, Shu Hu, Bin Zhu, Xi Wu, Hongtu Zhu, Siwei Lyu
TL;DR
This paper addresses the challenge of controllable stylization in Neural Style Transfer by introducing RL-NST, a reinforcement-learning–based framework that decomposes NST into step-wise, progressive decisions. An actor-critic architecture samples 2D latent actions to steer a stylizer, enabling flexible, user-controlled stylization while maintaining content fidelity and reducing computation relative to one-step DL models. The method combines perceptual content and style losses with temporal regularization to support both image and video NST, and validates its effectiveness through extensive experiments and ablations, showing improvements in stability, quality, and efficiency. The work advances NST by integrating progressive RL control, offering practical benefits for real-time or resource-constrained applications in art-style transfer and video stylization.
Abstract
Controlling the degree of stylization in Neural Style Transfer (NST) is difficult because it usually requires hand-tuning of hyper-parameters. In this paper, we propose the first deep Reinforcement Learning (RL) based architecture that splits one-step style transfer into a step-wise process for the NST task. Our RL-based method tends to preserve more details and structures of the content image in early steps and to synthesize more style patterns in later steps. This makes the degree of stylization easy for users to control. Additionally, because our RL-based model performs the stylization progressively, it is lightweight and has lower computational complexity than existing one-step Deep Learning (DL) based models. Experimental results demonstrate the effectiveness and robustness of our method.
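To make the step-wise idea concrete, the following is a minimal toy sketch, not the RL-NST model itself. It assumes a hypothetical `stylize_step` function that moves an image a small fraction of the way toward a style target, and a hypothetical `sample_action` that stands in for the actor by drawing a small Gaussian latent vector. All names, the blend rule, and the constants (`base_rate`, `dim`) are illustrative assumptions; the actual method uses learned networks and perceptual losses. The sketch only shows how keeping every intermediate result could let a user choose the degree of stylization by choosing a step.

```python
import math
import random


def sample_action(rng, dim=2):
    """Hypothetical stand-in for an actor: draw a Gaussian latent action."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def stylize_step(image, style, action, base_rate=0.15):
    """Hypothetical stand-in for one stylizer step.

    Blends the image toward the style target. The action only nudges the
    blend rate here; a learned stylizer would use it very differently.
    """
    modulation = sum(math.tanh(a) for a in action) / len(action)
    rate = base_rate * (1.0 + 0.1 * modulation)  # stays within (0, 1)
    return [(1.0 - rate) * x + rate * s for x, s in zip(image, style)]


def progressive_stylize(content, style, num_steps, seed=0):
    """Apply the step function num_steps times, keeping every intermediate."""
    rng = random.Random(seed)
    image = list(content)
    history = [image]  # history[0] is the unmodified content image
    for _ in range(num_steps):
        image = stylize_step(image, style, sample_action(rng))
        history.append(image)
    return history


def mean_abs_diff(a, b):
    """Mean absolute difference between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    content = [0.0] * 16  # toy 4x4 "content image", flattened
    style = [1.0] * 16    # toy 4x4 "style target", flattened
    frames = progressive_stylize(content, style, num_steps=10)
    for t in (0, 1, 5, 10):
        print(t, mean_abs_diff(frames[t], content))
```

In this toy setting, early entries of `frames` stay close to the content image and later entries drift toward the style target, so picking an index into `frames` plays the role of the user-facing control described in the abstract.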
