Unpaired Image-to-Image Translation via Neural Schrödinger Bridge
Beomsu Kim, Gihyun Kwon, Kwanyoung Kim, Jong Chul Ye
TL;DR
Diffusion models are powerful, but their Gaussian prior assumption limits them in unpaired image-to-image (I2I) translation. The authors introduce the Unpaired Neural Schrödinger Bridge (UNSB), which reframes the Schrödinger Bridge (SB) problem as a sequence of adversarial transport problems with a KL-divergence constraint and uses a time-conditioned generator to learn a chain of conditional mappings. They identify the curse of dimensionality as the core bottleneck for SB in high dimensions and validate UNSB through toy sanity checks and large-scale I2I benchmarks (e.g., Horse2Zebra, Summer2Winter, Map2Satellite), where it outperforms GAN- and diffusion-based baselines. The approach enables scalable, multi-step SB-based translation and suggests a new direction for applying diffusion-style models to unpaired, high-resolution image translation.
Abstract
Diffusion models are a powerful class of generative models that simulate stochastic differential equations (SDEs) to generate data from noise. Although diffusion models have achieved remarkable progress, they are limited in unpaired image-to-image (I2I) translation tasks by the Gaussian prior assumption. The Schrödinger Bridge (SB), which learns an SDE to translate between two arbitrary distributions, has emerged as an attractive solution to this problem. Yet, to the best of our knowledge, no SB model so far has been successful at unpaired translation between high-resolution images. In this work, we propose the Unpaired Neural Schrödinger Bridge (UNSB), which expresses the SB problem as a sequence of adversarial learning problems. This allows us to incorporate advanced discriminators and regularization to learn an SB between unpaired data. We show that UNSB is scalable and successfully solves various unpaired I2I translation tasks. Code: https://github.com/cyclomon/UNSB
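
To make the multi-step translation summarized above more concrete, the following is a minimal NumPy sketch of how a time-conditioned generator might be applied as a chain of conditional mappings. It is an illustration under stated assumptions, not the authors' implementation: `toy_generator`, `translate`, the step count, the noise scale `tau`, and the Gaussian interpolation rule between the current state and the predicted endpoint are all hypothetical choices, and the adversarial training that would produce a real generator is omitted entirely.

```python
import numpy as np


def toy_generator(x_t, t):
    """Hypothetical stand-in for a trained time-conditioned generator.

    A real generator would be a neural network that maps the current
    state x_t at time t to a predicted target-domain sample. Here it
    simply predicts a constant "target" so the sketch stays runnable.
    """
    return np.full_like(x_t, 3.0)


def translate(x0, generator, num_steps=5, tau=0.01, seed=0):
    """Translate source samples x0 through a chain of intermediate states.

    At each step the generator predicts a target-domain sample, and the
    next state is drawn from a Gaussian centered on an interpolation
    between the current state and that prediction. The interpolation
    weights and variance below are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    times = np.linspace(0.0, 1.0, num_steps + 1)
    x_t = np.asarray(x0, dtype=float)
    for t_cur, t_next in zip(times[:-1], times[1:]):
        x1_pred = generator(x_t, t_cur)
        # Fraction of the remaining time covered by this step (1 on the last step).
        s = (t_next - t_cur) / (1.0 - t_cur)
        mean = s * x1_pred + (1.0 - s) * x_t
        # Variance shrinks to zero on the final step, so the output is the prediction.
        var = max(s * (1.0 - s) * tau * (1.0 - t_cur), 0.0)
        x_t = mean + np.sqrt(var) * rng.standard_normal(x_t.shape)
    return x_t


if __name__ == "__main__":
    source = np.zeros((4, 2))  # four toy 2-D "images" from the source domain
    print(translate(source, toy_generator, num_steps=5))
```

In this sketch the same generator is reused at every step and only the time argument changes, which mirrors the idea of a single time-conditioned network learning the whole chain. With the toy generator, the final state equals its constant prediction because the injected noise vanishes on the last step.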
