StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation
Sidi Wu, Yizi Chen, Samuel Mermet, Lorenz Hurni, Konrad Schindler, Nicolas Gonthier, Loic Landrieu
TL;DR
StegoGAN tackles non-bijective image-to-image translation by performing steganography explicitly in feature space, which prevents the generator from hallucinating unmatchable classes. The method disentangles matchable and unmatchable information through a backward encoder-decoder and a learnable unmatchability mask; forward translations are augmented with unmatchable content only as needed. Targeted losses and a mask regularization keep the masks sparse and interpretable and enforce consistency of matchable content across translations, all without paired supervision. Empirical results on PlanIGN, Google Maps, and BraTS MRI show improved semantic fidelity and fewer false positives of unmatchable content than CycleGAN and other baselines, with ablations highlighting the pivotal role of L_reg and the mask mechanism. This approach advances reliable domain translation in real-world, asymmetric settings and opens avenues for applying steganography-inspired safeguards to other generative translation tasks.
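The masking idea summarized above can be illustrated with a toy sketch. It is not the authors' implementation: it assumes the unmatchability mask is a soft value in [0, 1] applied elementwise to a feature vector, and that the regularizer behaves like a mean absolute value of the mask. The names `split_by_mask` and `mask_sparsity_penalty` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple


def split_by_mask(features: List[float], mask: List[float]) -> Tuple[List[float], List[float]]:
    """Split features into (matchable, unmatchable) parts with a soft mask.

    Assumption: mask values lie in [0, 1], where 1 marks content that has
    no counterpart in the other domain.
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    matchable = [f * (1.0 - m) for f, m in zip(features, mask)]
    unmatchable = [f * m for f, m in zip(features, mask)]
    return matchable, unmatchable


def mask_sparsity_penalty(mask: List[float]) -> float:
    """Mean absolute mask value, an L1-style stand-in for a sparsity term."""
    return sum(abs(m) for m in mask) / len(mask)


# Toy feature vector and mask (illustrative values only).
features = [2.0, -1.0, 4.0, 0.5]
mask = [0.0, 0.0, 1.0, 0.5]

matchable, unmatchable = split_by_mask(features, mask)
penalty = mask_sparsity_penalty(mask)
```

In this sketch the two parts always sum back to the original features, so no information is lost by the split, while the penalty grows with the share of features flagged as unmatchable, which pushes the mask toward sparsity.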
Abstract
Most image-to-image translation models postulate that a unique correspondence exists between the semantic classes of the source and target domains. However, this assumption does not always hold in real-world scenarios due to divergent distributions, different class sets, and asymmetrical information representation. As conventional GANs attempt to generate images that match the distribution of the target domain, they may hallucinate spurious instances of classes absent from the source domain, thereby diminishing the usefulness and reliability of translated images. CycleGAN-based methods are also known to hide the mismatched information in the generated images to bypass cycle consistency objectives, a process known as steganography. In response to the challenge of non-bijective image translation, we introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images. Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision. Our experimental evaluations demonstrate that StegoGAN outperforms existing GAN-based models across various non-bijective image-to-image translation tasks, both qualitatively and quantitatively. Our code and pretrained models are accessible at https://github.com/sian-wusidi/StegoGAN.
