DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration
Nan Gao, Jia Li, Huaibo Huang, Zhi Zeng, Ke Shang, Shuwu Zhang, Ran He
TL;DR
DiffMAC addresses blind face restoration (BFR) that must generalize across diverse, out-of-domain degradations by introducing a Diffusion-Information-Diffusion (DID) framework. Stage I is an AdaIN-guided diffusion model that aligns the low-quality face to a stable high-quality (HQ) manifold. Stage II adds a manifold information bottleneck (MIB) that compresses restoration-relevant information while injecting identity cues. Both stages operate within a finetuned latent diffusion backbone (Stable Diffusion v2.1) and share a VAE. The approach is validated on synthetic and real-world datasets, where it shows superior fidelity and consistency in both photorealistic and heterogeneous domains; ablations and a user study support the perceptual benefits. The key contributions are (1) the high-generalization DID framework, (2) the novel manifold information bottleneck module, and (3) evidence of competitive performance without reliance on banked priors, enabling robust BFR across diverse scenes with controllable identity preservation. Overall, DiffMAC offers a practical, scalable solution for real-world blind face restoration, with improved generalization and with interpretability gained from information-theoretic constraints on the diffusion manifold.
Abstract
Blind face restoration (BFR) is a highly challenging problem due to the uncertainty of degradation patterns. Current methods generalize poorly across photorealistic and heterogeneous domains. In this paper, we propose a Diffusion-Information-Diffusion (DID) framework to tackle diffusion manifold hallucination correction (DiffMAC), which achieves high-generalization face restoration in diverse degraded scenes and heterogeneous domains. Specifically, the first diffusion stage uses AdaIN to align the restored face with the spatial feature embedding of the low-quality face; this synthesizes degradation-removal results, but with uncontrollable artifacts in some hard cases. Building on Stage I, Stage II applies information compression through a manifold information bottleneck (MIB) and finetunes the first diffusion model to improve facial fidelity. DiffMAC effectively counters blind degradation patterns and synthesizes high-quality faces with consistent attributes and identity. Experimental results demonstrate the superiority of DiffMAC over state-of-the-art methods, with a high degree of generalization in real-world and heterogeneous settings. The source code and models will be made public.
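For readers unfamiliar with AdaIN, the following is a minimal NumPy sketch of standard adaptive instance normalization, which re-scales one feature map so that its per-channel mean and standard deviation match those of another. It illustrates the generic operation only, not DiffMAC's actual Stage I implementation. The function name `adain`, the `(C, H, W)` layout, the variable names, and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Standard AdaIN on (C, H, W) feature maps (illustrative, not DiffMAC's code)."""
    # Per-channel statistics, computed over the spatial dimensions.
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    # Whiten the content features, then re-color them with the style statistics.
    normalized = (content - c_mean) / (c_std + eps)
    return normalized * s_std + s_mean


# Hypothetical stand-ins: random features for the restored face and the LQ face.
rng = np.random.default_rng(0)
restored_feat = rng.normal(0.0, 1.0, size=(4, 8, 8))
lq_feat = rng.normal(2.0, 3.0, size=(4, 8, 8))

aligned = adain(restored_feat, lq_feat)
```

After the call, `aligned` keeps the spatial structure of `restored_feat` while carrying the per-channel statistics of `lq_feat`. How DiffMAC obtains these features and where it applies the operation inside the diffusion model is not specified in the abstract.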
