Learning to Manipulate Artistic Images
Wei Guo, Yuqi Zhang, De Ma, Qian Zheng
TL;DR
This work tackles zero-shot manipulation of artistic images without relying on semantic inputs, addressing the cross-domain artifacts and imprecise local details of prior exemplar-based methods. It introduces SIM-Net, a dual-branch framework. The first branch, a Mask-Based Correspondence Network, takes semantic-free masks $y_A$ and exemplar guidance $y_B$ and uses a Dilating Module to produce full-resolution warp fields $\omega^k$ that guide region-wise transformation. The second branch, a Translation Network, merges the warped regions through a region transportation strategy and refines the result with a Texture-Guidance Module, which forms a pseudo ground truth from a single warp $\omega_{\mathcal{S}}$. Training is self-supervised, with a total loss $\mathcal{L}_{total}$ that includes $\mathcal{L}_{bound}$, $\mathcal{L}_{context}$, and $\mathcal{L}_{cyc}$. Experiments on 237 artistic images across 10 styles show that SIM-Net achieves competitive style fidelity and high-quality, artifact-free manipulations at low computational cost, making it practical for flexible, high-resolution artistic image editing without extensive style-specific training.
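To make the idea of region-wise warping followed by merging concrete, here is a minimal pure-Python sketch. It is not the SIM-Net implementation: in the paper the warp fields $\omega^k$ are predicted by a network, whereas here they are hand-specified integer offsets, and the names `warp_region` and `merge_regions`, the nearest-neighbour sampling, and the "first region wins" merge rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]            # grayscale image, image[y][x]
Mask = List[List[int]]               # binary region mask, 1 = inside region
Warp = List[List[Tuple[int, int]]]   # per-pixel backward offset (dy, dx)


def warp_region(image: Image, mask: Mask, warp: Warp, fill: float = 0.0):
    """Backward-warp one masked region with nearest-neighbour sampling.

    For each output pixel (y, x), look up the source pixel at
    (y + dy, x + dx) and copy it only if that source pixel lies
    inside the region mask. Returns (warped_image, warped_mask).
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    out_mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = warp[y][x]
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w and mask[sy][sx]:
                out[y][x] = image[sy][sx]
                out_mask[y][x] = 1
    return out, out_mask


def merge_regions(image: Image, masks: List[Mask], warps: List[Warp],
                  fill: float = 0.0):
    """Warp every region with its own field and composite the results.

    Assumed merge rule (hypothetical): earlier regions take priority
    where warped regions overlap; uncovered pixels keep `fill`.
    Returns (merged_image, coverage_mask).
    """
    h, w = len(image), len(image[0])
    merged = [[fill] * w for _ in range(h)]
    covered = [[0] * w for _ in range(h)]
    for mask, warp in zip(masks, warps):
        warped, warped_mask = warp_region(image, mask, warp, fill)
        for y in range(h):
            for x in range(w):
                if warped_mask[y][x] and not covered[y][x]:
                    merged[y][x] = warped[y][x]
                    covered[y][x] = 1
    return merged, covered
```

In this sketch, pixels that no warped region reaches stay at `fill` and are flagged in the coverage mask. How such gaps and overlaps are actually resolved is the job of the paper's region transportation strategy and refinement stage, which this toy example does not attempt to reproduce.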
Abstract
Recent advances in computer vision have significantly lowered the barriers to artistic creation. Exemplar-based image translation methods have attracted much attention for their flexibility and controllability. However, these methods either make assumptions about semantics or require semantic information as input, and accurate semantics are not easy to obtain for artistic images. In addition, these methods suffer from cross-domain artifacts caused by priors in the training data, and they generate imprecise structure because features are compressed in the spatial domain. In this paper, we propose an arbitrary Style Image Manipulation Network (SIM-Net), which uses semantic-free information as guidance and a region transportation strategy, trained in a self-supervised manner, for image generation. Our method balances computational efficiency and high resolution to a certain extent. Moreover, our method facilitates zero-shot style image manipulation. Both qualitative and quantitative experiments demonstrate the superiority of our method over state-of-the-art methods. Code is available at https://github.com/SnailForce/SIM-Net.
