LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis
Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Bo Dai, Deli Zhao, Qifeng Chen
TL;DR
LinkGAN addresses the lack of explicit latent-to-pixel linkage in GANs by introducing a regularizer that partitions latent codes and image regions, enforcing that each latent subspace controls a corresponding image region. The method enables precise local edits for both 2D and 3D-aware generation, including fixed and semantic regions and multiple-region configurations, while remaining compatible with GAN inversion. Empirical results on FFHQ, AFHQ, LSUN-Church, and LSUN-Car show improved local controllability with only modest degradation in synthesis quality, and ablations indicate effective control with around 64 axes per linked region. This approach advances spatial controllability in GANs and opens pathways for real-image editing and region-aware synthesis without extensive architectural changes.
Abstract
This work presents an easy-to-use regularizer for GAN training, which helps explicitly link some axes of the latent space to a set of pixels in the synthesized image. Establishing such a connection facilitates more convenient local control of GAN generation, where users can alter the image content only within a spatial area simply by partially resampling the latent code. Experimental results confirm four appealing properties of our regularizer, which we call LinkGAN. (1) The latent-pixel linkage is applicable to either a fixed region (i.e., the same for all instances) or a particular semantic category (i.e., varying across instances), like the sky. (2) Two or more regions can be independently linked to different latent axes, which further supports joint control. (3) Our regularizer can improve the spatial controllability of both 2D and 3D-aware GAN models, with little sacrifice in synthesis performance. (4) The models trained with our regularizer are compatible with GAN inversion techniques and maintain editability on real images.
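As a minimal sketch of what "partially resampling the latent code" could look like in practice, the snippet below redraws only a chosen subset of latent axes and leaves the rest untouched. The latent dimensionality of 512, the choice of the first 64 axes as the linked ones, and the helper name `resample_linked_axes` are illustrative assumptions, not details taken from a released implementation.

```python
import numpy as np


def resample_linked_axes(z, linked_axes, rng):
    """Return a copy of z in which only the linked axes are redrawn from N(0, 1).

    Hypothetical helper for illustration; all other axes are left unchanged.
    """
    z_new = z.copy()
    z_new[..., linked_axes] = rng.standard_normal(z[..., linked_axes].shape)
    return z_new


rng = np.random.default_rng(0)

latent_dim = 512               # assumed latent size
linked_axes = np.arange(64)    # assumed: first 64 axes are linked to the region

# Original latent code and an edited copy that differs only on the linked axes.
z = rng.standard_normal((1, latent_dim))
z_edit = resample_linked_axes(z, linked_axes, rng)

# Axes outside the linked subspace are untouched by the edit.
unlinked_axes = np.setdiff1d(np.arange(latent_dim), linked_axes)
```

Feeding `z` and `z_edit` to a generator trained with the regularizer would then be expected, per the linkage described above, to give two images that differ mainly inside the linked region.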
