RoNet: Rotation-oriented Continuous Image Translation
Yi Li, Xin Xie, Lina Lei, Haiyan Fu, Yanqing Guo
TL;DR
RoNet tackles continuous multi-domain image translation by representing domain relationships on a learned annular manifold and rotating a style vector within an automatically discovered 2D rotation plane. It jointly learns to disentangle content and style while enforcing semantic consistency through a VSA loss and a patch-based semantic style loss, improving texture realism in challenging forest, face, and street scenes. The method achieves superior visual quality and continuity compared with multiple baselines, with favorable LPIPS, FID, and KID scores, and supports high-resolution outputs. Overall, RoNet provides a general, end-to-end approach for smooth, cyclic domain translation from a single input image, applicable to seasonal variation, time-of-day shifts, and cross-domain photography styles.
Abstract
Generating smooth, continuous images between domains has recently drawn much attention in image-to-image (I2I) translation. Most existing approaches assume a linear relationship, applied variously to features, models, or labels. However, the linear assumption becomes harder to satisfy as the dimensionality of these elements increases, and it is limited by the need to obtain both ends of the line. In this paper, we propose a novel rotation-oriented solution that models continuous generation as an in-plane rotation of the style representation of an image, yielding a network named RoNet. A rotation module is embedded in the generation network to automatically learn the proper plane while disentangling the content and the style of an image. To encourage realistic texture, we also design a patch-based semantic style loss that learns the different styles of similar objects in different domains. We conduct experiments on forest scenes (where the complex texture makes generation very challenging), faces, streetscapes, and the iphone2dslr task. The results validate the superiority of our method in terms of visual quality and continuity.
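
To make the idea of an in-plane rotation over a style representation concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the style is a plain vector and that the rotation plane is spanned by two fixed vectors orthonormalized with Gram-Schmidt, whereas RoNet learns the plane inside the network. The names `orthonormal_plane` and `rotate_in_plane`, the dimension 8, and the random vectors are all hypothetical choices for illustration.

```python
import numpy as np


def orthonormal_plane(a, b):
    """Return an orthonormal basis (u, v) of the plane spanned by a and b.

    Assumption: a and b are linearly independent. In RoNet the plane is
    learned; here it is fixed for illustration only.
    """
    u = a / np.linalg.norm(a)
    b_perp = b - np.dot(b, u) * u  # Gram-Schmidt: remove the component along u
    v = b_perp / np.linalg.norm(b_perp)
    return u, v


def rotate_in_plane(s, u, v, theta):
    """Rotate s by angle theta inside span{u, v}.

    The component of s orthogonal to the plane is left unchanged, so the
    vector norm is preserved.
    """
    su, sv = np.dot(s, u), np.dot(s, v)  # coordinates of s within the plane
    return (
        s
        + (np.cos(theta) - 1.0) * (su * u + sv * v)
        + np.sin(theta) * (su * v - sv * u)
    )


# Hypothetical 8-dimensional style vector and plane.
rng = np.random.default_rng(0)
dim = 8
u, v = orthonormal_plane(rng.normal(size=dim), rng.normal(size=dim))
style = rng.normal(size=dim)

# Sweep the angle over a full turn; the last vector coincides with the first.
angles = np.linspace(0.0, 2.0 * np.pi, 5)
rotated = [rotate_in_plane(style, u, v, t) for t in angles]
```

Because the rotation is parameterized by a single angle, only one starting style vector is needed to sweep through intermediate styles, and a full turn returns to the starting point. This illustrates the contrast with linear interpolation, which requires both endpoints of the line.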
