Improving Detail in Pluralistic Image Inpainting with Feature Dequantization
Kyungri Park, Woohwan Jung
TL;DR
The paper addresses the loss of detail in VQGAN-based pluralistic image inpainting (PII) that results from quantizing features against a codebook. It introduces the Feature Dequantization Module (FDM), which estimates and corrects the quantization error to restore fine details, together with an efficient training approach that avoids costly sampling. Experiments on Places and Paris Street View show that FDM improves detail, texture consistency, and structural fidelity with negligible overhead, and that the method generalizes to other image generation tasks, where it improves FID.
Abstract
Pluralistic Image Inpainting (PII) offers multiple plausible solutions for restoring missing parts of images and has been applied successfully in applications such as image editing and object removal. Recently proposed VQGAN-based methods significantly improve the structural integrity of the generated images. Nevertheless, the state-of-the-art VQGAN-based model, PUT, faces a critical challenge: the detail quality of its output images degrades because of feature quantization. Feature quantization restricts the latent space and causes information loss, which harms the detail quality that is essential for image inpainting. To address this problem, we propose the Feature Dequantization Module (FDM), which is specifically designed to restore the detail quality of images by compensating for the information loss. Furthermore, we develop an efficient training method for FDM that drastically reduces training costs. We empirically demonstrate that our method significantly enhances the detail quality of the generated images with negligible training and inference overheads.
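To make the information loss concrete, the following minimal sketch quantizes one feature vector against a small random codebook and measures the residual that quantization discards. It is an illustration only, not the authors' FDM: the function names (`quantize`, `residual`, `dequantize`), the codebook size, and the feature dimension are all hypothetical, and the residual here is computed directly from the original feature, whereas a learned module would presumably have to estimate it.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature, codebook):
    """Return (index, codeword) of the codebook entry nearest to the feature."""
    index = min(range(len(codebook)), key=lambda k: sq_dist(feature, codebook[k]))
    return index, codebook[index]


def residual(feature, codeword):
    """Quantization error: what is lost by snapping the feature to a codeword."""
    return [x - y for x, y in zip(feature, codeword)]


def dequantize(codeword, estimated_residual):
    """Add an estimate of the quantization error back onto the codeword."""
    return [c + r for c, r in zip(codeword, estimated_residual)]


# Hypothetical toy setup: 8 codewords of dimension 4, fixed seed for repeatability.
rng = random.Random(0)
codebook = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
feature = [rng.uniform(-1, 1) for _ in range(4)]

index, codeword = quantize(feature, codebook)
error = residual(feature, codeword)

# Using the exact residual (an oracle, available only because we kept the
# original feature) recovers the feature up to floating-point rounding.
restored = dequantize(codeword, error)
```

With the exact residual the original feature is recovered, which shows that all of the lost detail lives in that residual; the quality of any dequantization step therefore depends on how well the residual can be estimated when the original feature is not available.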
