Improving Detail in Pluralistic Image Inpainting with Feature Dequantization

Kyungri Park, Woohwan Jung

TL;DR

The paper addresses the loss of fine detail in VQGAN-based pluralistic image inpainting, which is caused by quantizing features to a codebook. It introduces the Feature Dequantization Module (FDM), which estimates and corrects the quantization error to restore fine details, together with an efficient training method that avoids costly sampling. Experiments on Places and Paris Street View show that FDM improves detail, texture consistency, and structural fidelity with negligible overhead, and that the method carries over to other image generation tasks, where it improves FID.

Abstract

Pluralistic Image Inpainting (PII) offers multiple plausible solutions for restoring missing parts of images and has been successfully applied to applications such as image editing and object removal. Recently proposed VQGAN-based methods have been shown to significantly improve the structural integrity of the generated images. Nevertheless, the state-of-the-art VQGAN-based model PUT faces a critical challenge: degradation of detail quality in output images due to feature quantization. Feature quantization restricts the latent space and causes information loss, which harms the detail quality essential for image inpainting. To tackle this problem, we propose the Feature Dequantization Module (FDM), which is specifically designed to restore the detail quality of images by compensating for the information loss. Furthermore, we develop an efficient training method for FDM that drastically reduces training costs. We empirically demonstrate that our method significantly enhances the detail quality of the generated images with negligible training and inference overheads.
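
To make the notion of information loss from feature quantization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy random codebook and nearest-neighbor lookup under squared Euclidean distance, and it replaces the learned FDM with a hypothetical stand-in that recovers half of the true residual. The names `quantize`, `sq_error`, and `estimated_residual` are illustrative only. A real dequantization module has no access to the unquantized feature at inference time and must predict the residual from context.

```python
import random


def quantize(feature, codebook):
    """Return the codebook entry nearest to `feature` (squared Euclidean distance)."""
    return min(codebook, key=lambda c: sum((f - x) ** 2 for f, x in zip(feature, c)))


def sq_error(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


random.seed(0)
dim, codebook_size = 4, 8  # toy sizes, chosen only for illustration
codebook = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(codebook_size)]
feature = [random.uniform(-1, 1) for _ in range(dim)]

# Quantization snaps the continuous feature to one of a few discrete points.
quantized = quantize(feature, codebook)

# The residual is exactly the information discarded by quantization.
residual = [f - q for f, q in zip(feature, quantized)]

# Hypothetical stand-in for a learned module: it recovers half of the true residual.
# This uses the ground-truth residual and is therefore an illustration, not a method.
estimated_residual = [0.5 * r for r in residual]
dequantized = [q + e for q, e in zip(quantized, estimated_residual)]

error_before = sq_error(feature, quantized)
error_after = sq_error(feature, dequantized)
print(f"error before correction: {error_before:.6f}")
print(f"error after correction:  {error_after:.6f}")
```

Because the stand-in recovers half of each residual component, the squared error after correction is one quarter of the error before it. Any estimate that points partly in the direction of the true residual gives a similar reduction, which is the intuition behind compensating for quantization error.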

Paper Structure

This paper contains 13 sections, 7 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: An example of the visible boundary between the generated area and the masked image caused by feature quantization. Although the generated area in the center image appears plausible and realistic, a slight color mismatch at the boundary makes it noticeable when combined with the masked image (on the right).
  • Figure 2: An overview of the proposed method. Quantized features are limited to distinct points, represented by green, red, and blue. Dequantization through the proposed FDM expands the representation to a continuous space.
  • Figure 3: An overview of the training procedure.
  • Figure 4: Detail comparison between the proposed method and PUT. More results are presented in the supplementary material.
  • Figure 5: Visual comparison of diverse inpainting results in Places and Paris Street View.
  • ...and 3 more figures