ARIN: Adaptive Resampling and Instance Normalization for Robust Blind Inpainting of Dunhuang Cave Paintings

Alexander Schmidt, Prathmesh Madhu, Andreas Maier, Vincent Christlein, Ronak Kosti

TL;DR

This work targets robust blind inpainting of culturally important Dunhuang cave paintings. It combines Content Adaptive Resampling (CAR) with the Half Instance Normalization Network (HINet) into Adaptive Resampled Instance Normalization (ARIN), a restoration network that is resilient to Gaussian noise and JPEG artifacts. Using transfer learning from pretrained CAR and HINet models together with a dedicated ARIN module, the approach achieves performance competitive with the state of the art on the Dunhuang Challenge dataset: HINet-DB often leads on the reported metrics, while ARIN offers strong robustness to real-world degradations. The results demonstrate the practical value of adaptive downsampling and denoising for restoring historical murals while preserving perceptual quality, with end-to-end training left as future work.

Abstract

Image enhancement algorithms are very useful for real-world computer vision tasks where image resolution is often physically limited by the sensor size. While state-of-the-art deep neural networks show impressive results for image enhancement, they often struggle to enhance real-world images. In this work, we tackle a real-world setting: inpainting of images from Dunhuang caves. The Dunhuang dataset consists of murals, half of which suffer from corrosion and aging. These murals feature a range of rich content, such as Buddha statues, bodhisattvas, sponsors, architecture, dance, music, and decorative patterns designed by different artists spanning ten centuries, which makes manual restoration challenging. We modify two different existing methods (CAR, HINet) that are based upon state-of-the-art (SOTA) super-resolution and deblurring networks. We show that they can successfully inpaint and enhance these deteriorated cave paintings. We further show that a novel combination of CAR and HINet, resulting in our proposed inpainting network (ARIN), is very robust to external noise, especially Gaussian noise. To this end, we present a quantitative and qualitative comparison of our proposed approach with existing SOTA networks and winners of the Dunhuang Challenge. One of the proposed methods (HINet) represents the new state of the art and outperforms the first-place entry of the Dunhuang Challenge, while our combination ARIN, which is robust to noise, is comparable to the first-place entry. We also present and discuss qualitative results showing the impact of our method for inpainting on Dunhuang cave images.

Paper Structure (15 sections, 3 equations, 8 figures, 3 tables)

This paper contains 15 sections, 3 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: (Left) Damaged mural painting due to aging, (Center) Output when the left image is tested with a CAR-DF2K model, (Right) Inpainted image with the proposed method
  • Figure 2: Sample image from Dunhuang cave. Left: Damaged mural painting due to aging, Right: Partial, manual restoration.
  • Figure 3: An example from the Dunhuang Challenge dataset, showing a clean image, a mask that defines one random artificial deterioration, and the artificial deterioration applied to the clean image. The challenge uses the mask and the deteriorated image for training to recover the clean image.
  • Figure 4: Model Architecture of our proposed ARIN network.
  • Figure 5: An example image with rain as noise and the output of the pretrained HINet-DR model [chen2021hinet].
  • ...and 3 more figures
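
The data setup described in the Figure 3 caption (a mask applied to a clean image to produce a deteriorated training input) and the Gaussian-noise robustness setting mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the fill value used for masked pixels, the noise level, and the choice of PSNR as the comparison metric are all assumptions made for illustration, and the tiny grayscale image is synthetic.

```python
import math
import random


def apply_mask(clean, mask, fill=0.0):
    """Replace every pixel whose mask value is 1 with `fill` (assumed fill value)."""
    return [
        [fill if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(clean, mask)
    ]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to each pixel and clip the result to [0, 1]."""
    return [
        [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in image
    ]


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = sum(len(row) for row in reference)
    mse = sum(
        (a - b) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for a, b in zip(ref_row, est_row)
    ) / count
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Synthetic 4x4 grayscale "clean" image with intensities in [0, 1].
clean = [[0.2, 0.4, 0.6, 0.8] for _ in range(4)]

# Hypothetical deterioration mask: 1 marks a damaged pixel.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

deteriorated = apply_mask(clean, mask)
noisy = add_gaussian_noise(deteriorated, sigma=0.05, rng=random.Random(0))

print("PSNR clean vs. deteriorated:", round(psnr(clean, deteriorated), 2))
print("PSNR clean vs. noisy deteriorated:", round(psnr(clean, noisy), 2))
```

In a blind inpainting setting, the network would receive only the deteriorated (and possibly noisy) image at test time; the mask here serves only to synthesize the input.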