WINE: Wavelet-Guided GAN Inversion and Editing for High-Fidelity Refinement
Chaewon Kim, Seung-Jun Moon, Gyeong-Moon Park
TL;DR
WINE tackles the persistent low-frequency bias in GAN inversion with a frequency-domain approach that explicitly preserves high-frequency details. It combines a wavelet loss targeting the high-frequency subbands with a wavelet fusion mechanism that transfers high-frequency information into the generator, enabling high-fidelity inversion and robust editing. The method shows better reconstruction quality and editability than state-of-the-art baselines across multiple datasets, supported by ablations and theoretical insights into the information content of each subband. By bridging spatial and spectral information through wavelet analysis, this frequency-aware framework has practical implications for more accurate image restoration and editing in GAN-based pipelines, and may generalize to other wavelet-augmented generators.
Abstract
Recent GAN inversion models aim to convey high-fidelity information from the original image to the generator, either by tuning the generator or by learning high-dimensional features. Despite these efforts, accurately reconstructing image-specific details remains a challenge: inherent limitations in both training and model structure bias these models toward low-frequency information. In this paper, we examine the pixel loss widely used in GAN inversion and show that it predominantly focuses on reconstructing low-frequency features. We then propose WINE, a Wavelet-guided GAN Inversion aNd Editing model, which transfers high-frequency information through wavelet coefficients via a newly proposed wavelet loss and wavelet fusion scheme. Notably, WINE is the first attempt to interpret GAN inversion in the frequency domain. Our experimental results show that WINE preserves high-frequency details precisely and improves image quality. In editing scenarios as well, WINE outperforms existing state-of-the-art GAN inversion models, striking a fine balance between editability and reconstruction quality.
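To make the idea of a loss on high-frequency subbands concrete, the following is a minimal sketch and not the authors' implementation. This section does not specify the wavelet family, the number of decomposition levels, the norm, or any weighting used by WINE, so a single-level orthonormal Haar transform and an unweighted L1 distance are assumed here. The function names `haar_dwt2` and `high_freq_wavelet_loss` are hypothetical, and the LH/HL labels follow one common naming convention.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D orthonormal Haar transform of an (H, W) array.

    H and W must be even. Returns four subbands (LL, LH, HL, HH),
    each of shape (H/2, W/2).
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # difference between the two rows
    hl = (a - b + c - d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def high_freq_wavelet_loss(x, x_hat):
    """Mean L1 distance over the three high-frequency subbands.

    The LL subband is deliberately excluded (an assumption of this
    sketch), so only detail coefficients contribute to the loss.
    """
    bands_x = haar_dwt2(x)[1:]
    bands_y = haar_dwt2(x_hat)[1:]
    per_band = [np.abs(p - q).mean() for p, q in zip(bands_x, bands_y)]
    return float(np.mean(per_band))
```

In this sketch, a reconstruction that matches the target only in its smooth content still receives a nonzero loss whenever fine texture is missing, because that texture lives in the LH, HL, and HH coefficients rather than in LL.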
