Towards Interactive Image Inpainting via Sketch Refinement

Chang Liu, Shunxin Xu, Jialun Peng, Kaidong Zhang, Dong Liu

TL;DR

SketchRefiner tackles interactive image inpainting by separating sketch refinement from inpainting in a two-stage design that makes the use of user sketches more robust. The Sketch Refinement Network (SRN) employs a Registration Module and an Enhancement Module with a novel cross-correlation loss to align sketches and refine them coherently. The Sketch-modulated Inpainting Network (SIN) extracts sketch features with a Partial Sketch Encoder, and its Texture Restoration Module merges them as modulation through Sketch Feature Aggregation blocks. A Sketch Simulation Algorithm and a test protocol based on real-world sketches are introduced to address data scarcity and evaluate practical performance. Across ImageNet, Places2, and CelebA-HQ, SketchRefiner consistently outperforms state-of-the-art methods in both quantitative metrics (PSNR/SSIM/FID) and perceptual quality, demonstrating strong potential for real-world sketch-guided editing. Limitations include occasional over-refinement and a restriction to monochrome sketches; future work aims to balance sketch control against robustness to input randomness.
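
PSNR is one of the quantitative metrics named above. For readers unfamiliar with it, the following is a minimal Python sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat-list pixel representation, and the default peak value of 255 are assumptions made for illustration; this is not the paper's evaluation code.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences.

    Hypothetical helper: pixels are given as flat sequences, and max_value is
    the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the restored image is closer to the reference; identical inputs give an infinite PSNR.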

Abstract

One hard problem in image inpainting is restoring complex structures in the corrupted regions. This motivates interactive image inpainting, which leverages additional hints, e.g., sketches, to assist the inpainting process. Sketches are simple and intuitive for end users, but they are free-form and contain considerable randomness. Such randomness may confuse inpainting models and cause severe artifacts in the completed images. To address this problem, we propose a two-stage image inpainting method termed SketchRefiner. In the first stage, we propose using a cross-correlation loss function to robustly calibrate and refine the user-provided sketches in a coarse-to-fine fashion. In the second stage, we learn to extract informative features from the abstracted sketches in the feature space and use them to modulate the inpainting process. We also propose an algorithm to simulate real sketches automatically and build a test protocol with different applications. Experimental results on public datasets demonstrate that SketchRefiner effectively utilizes sketch information and eliminates the artifacts caused by free-form sketches. Our method consistently outperforms state-of-the-art methods both qualitatively and quantitatively, while showing great potential for real-world applications. Our code and dataset are available.
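
The abstract does not spell out the cross-correlation loss. As a rough illustration only, and not the authors' formulation, the sketch below uses the generic zero-mean normalized cross-correlation (NCC) between a refined sketch and a reference, turned into a loss as 1 - NCC. The function names, the flat-list representation of sketch maps, and the `eps` stabilizer are all hypothetical.

```python
import math
from typing import Sequence


def normalized_cross_correlation(a: Sequence[float], b: Sequence[float],
                                 eps: float = 1e-8) -> float:
    """Zero-mean normalized cross-correlation of two equally sized maps.

    Returns a value in roughly [-1, 1]; eps (an assumption) avoids division
    by zero when one input is constant.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Center both maps so that only their structure is compared.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db)) + eps
    return numerator / denominator


def cross_correlation_loss(refined: Sequence[float],
                           reference: Sequence[float]) -> float:
    """Generic NCC-based loss (assumed form): near 0 when the maps agree."""
    return 1.0 - normalized_cross_correlation(refined, reference)
```

Because NCC compares centered, normalized maps, a loss of this kind is insensitive to global intensity offsets and scaling, which is one reason correlation-style objectives are often used when inputs are only loosely aligned.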

Paper Structure (22 sections, 4 equations, 15 figures, 12 tables, 1 algorithm)

This paper contains 22 sections, 4 equations, 15 figures, 12 tables, 1 algorithm.

Figures (15)

  • Figure 1: Interactive image inpainting results for scene editing (top two rows) and face manipulation (third row), produced by SketchEdit [zeng2022sketchedit] and our proposed SketchRefiner. Given user-provided sketches and masks (second and fifth columns), SketchEdit [zeng2022sketchedit] treats the sketches as if they were edges in the missing area and therefore produces noticeable artifacts. In contrast, our method uses the sketches as coarse guidance, so its results reflect user intentions while tolerating some randomness in the user input. Zoom in for best view.
  • Figure 2: Overview of SketchRefiner (left) and the Sketch Feature Aggregation (SFA) block. SketchRefiner consists of the Sketch Refinement Network (SRN) and the Sketch-modulated Inpainting Network (SIN). To address the misalignment of sketches, we first send the sketch $\textbf{S}$, corrupted image $\textbf{I}_m$, and mask $\textbf{M}$ into the Registration Module (RM), which produces a coarsely refined sketch $\textbf{S}_{coarse}$. Then, the Enhancement Module (EM) further corrects the structural incoherence in $\textbf{S}_{coarse}$ and yields the final refined sketch $\hat{\textbf{S}}$. Here, $\lambda_1$ and $\lambda_2$ are loss-weighting hyperparameters. In SIN, the Partial Sketch Encoder (PSE) first learns to extract features from $\hat{\textbf{S}}$. Then, the Texture Restoration Module (TRM) merges these features as modulation through Sketch Feature Aggregation (SFA) blocks. Finally, SIN restores the missing texture and produces the inpainted result $\hat{\textbf{I}}$.
  • Figure 3: Visualization of edges and different sketches. (a) Image, (b) detected edge [he2020bdcn], (c) sketch provided in TU-Berlin [eitz2012hdhso], (d) sketch provided in Sketchy [sangkloy2016sketchy], (e) user-drawn sketch from our user study, and (f) simulated sketch with SSA (Algorithm 1). Zoom in for best view.
  • Figure 4: Qualitative results of face restoration with synthetic sketches (top two rows) and face editing with user-drawn sketches (bottom two rows), compared to (c) DeepFill-v2 [yu2018free], (d) DeepPS [yang2020deep], (e) LaMa [suvorov2022resolution], (f) SketchEdit [zeng2022sketchedit], and (g) ZITS [dong2022incremental]. (b) shows the masked input with the sketch. Zoom in for best view.
  • Figure 5: Qualitative results of scene editing with user-drawn sketches, compared to (c) DeepFill-v2 [yu2018free], (d) LaMa [suvorov2022resolution], (e) SketchEdit [zeng2022sketchedit], and (f) ZITS [dong2022incremental]. (b) shows the masked input with the sketch. Zoom in for best view.
  • ...and 10 more figures