DiffuMural: Restoring Dunhuang Murals with Multi-scale Diffusion
Puyu Han, Jiaju Kang, Yuhang Pan, Erting Pan, Zeyu Zhang, Qunchao Jin, Juntao Jiang, Zhichen Liu, Luqi Gong
TL;DR
DiffuMural addresses the challenging task of restoring large-scale Dunhuang murals from limited training data by introducing a contour-guided, multi-scale diffusion framework. It integrates contour extraction, conditional guidance, mural-spatial attention, dynamic cross-scale diffusion, and frequency-domain optimization to produce coherent, detail-rich restorations, and it uses expert evaluations to check that the results respect the murals' cultural and historical significance. The approach outperforms state-of-the-art methods on both automated indices (e.g., SSIM, ECON) and qualitative assessments, demonstrating practical potential for digital heritage preservation. Trained on a curated set of 23 murals and assessed with human-centered evaluation, DiffuMural offers a scalable, ethically attuned tool for digital restoration and conservation planning.
Abstract
Large-scale pre-trained diffusion models have produced excellent results in conditional image generation. However, the restoration of ancient murals, an important downstream task in this field, poses significant challenges to diffusion-based restoration methods because the damaged areas are large and training samples are scarce. Conditional restoration tasks are more concerned with whether the restored part meets the aesthetic standards of mural restoration in terms of overall style and seam detail, yet metrics for evaluating this kind of heuristic image completion are lacking in current research. We therefore propose DiffuMural, which combines a multi-scale convergence and collaborative diffusion mechanism with ControlNet and a cyclic consistency loss to optimise the matching between the generated images and the conditional control. DiffuMural demonstrates outstanding capabilities in mural restoration, leveraging training data from 23 large-scale Dunhuang murals that exhibit consistent visual aesthetics. The model excels at restoring intricate details, achieving a coherent overall appearance, and addressing the unique challenges posed by incomplete murals that lack factual grounding. Our evaluation framework incorporates four key metrics to quantitatively assess incomplete murals: factual accuracy, textural detail, contextual semantics, and holistic visual coherence. Furthermore, we integrate humanistic value assessments to ensure that the restored murals retain their cultural and artistic significance. Extensive experiments show that our method outperforms state-of-the-art (SOTA) approaches in both qualitative and quantitative evaluations.
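
To illustrate what matching "between the generated images and the conditional control" can mean in practice, here is a minimal sketch of a cycle-style consistency check: contours are re-extracted from a generated image and compared with the contour map that conditioned it. This is not the paper's loss. The central-difference edge detector, the L1 distance, and the names `edge_map` and `cycle_consistency_l1` are all assumptions chosen for illustration, and the abstract does not specify how DiffuMural extracts contours or formulates its cyclic consistency loss.

```python
def edge_map(img):
    """Gradient-magnitude edge map via central differences.

    `img` is a grayscale image given as a list of rows of floats.
    Border pixels are left at 0. This is a hypothetical stand-in for
    whatever contour extractor the actual method uses.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def cycle_consistency_l1(generated, condition):
    """Mean absolute difference between the contours re-extracted from
    `generated` and the conditioning contour map `condition`.

    A value of 0 means the generated image reproduces the conditioning
    contours exactly (under this illustrative edge detector).
    """
    edges = edge_map(generated)
    h, w = len(edges), len(edges[0])
    total = sum(
        abs(edges[y][x] - condition[y][x])
        for y in range(h)
        for x in range(w)
    )
    return total / (h * w)
```

In a training setting, a differentiable version of such a term would typically be added to the diffusion objective so that generated content is penalised when it drifts from the contour guidance; the weighting and exact form used by DiffuMural would need to be taken from the full paper.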
