Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting
Jacob Fein-Ashley, Benjamin Fein-Ashley
TL;DR
The paper tackles realistic image inpainting, especially for large missing regions where preserving structure and texture is difficult. It combines diffusion-based inpainting with anisotropic Gaussian splatting: missing regions are modeled with gradient-adaptive Gaussians, and the resulting multi-scale splat maps guide the diffusion process. Key contributions include anisotropic Gaussian modeling, integration of splat guidance into a diffusion inpainting network, multi-scale splatting, and experiments on CIFAR-10 and CelebA that report higher fidelity and structural coherence than existing methods. The approach suggests that explicit structural priors can substantially improve inpainting quality, with potential impact on photo editing, restoration, and occlusion handling in vision systems.
Abstract
Image inpainting is a fundamental task in computer vision, aiming to restore missing or corrupted regions in images realistically. While recent deep learning approaches have significantly advanced the state of the art, challenges remain in maintaining structural continuity and generating coherent textures, particularly in large missing areas. Diffusion models have shown promise in generating high-fidelity images but often lack the structural guidance necessary for realistic inpainting. We propose a novel inpainting method that combines diffusion models with anisotropic Gaussian splatting to capture both local structures and global context effectively. By modeling missing regions using anisotropic Gaussian functions that adapt to local image gradients, our approach provides structural guidance to the diffusion-based inpainting network. The Gaussian splat maps are integrated into the diffusion process, enhancing the model's ability to generate high-fidelity and structurally coherent inpainting results. Extensive experiments demonstrate that our method outperforms state-of-the-art techniques, producing visually plausible results with enhanced structural integrity and texture realism.
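To make the idea of gradient-adaptive anisotropic Gaussians more concrete, the following is a minimal NumPy sketch of how a splat map over a missing region might be built. It is an illustration under stated assumptions, not the authors' implementation: the function names, the parameters `base_sigma`, `stretch`, and `stride`, the rule for elongating each Gaussian along the local edge direction, and the choice of scales are all hypothetical, since the abstract does not specify them.

```python
import numpy as np

def anisotropic_splat_map(image, mask, base_sigma=2.0, stretch=2.0, stride=2):
    """Sum of oriented Gaussians centred on masked pixels (illustrative only).

    image: 2-D float array (grayscale), mask: 2-D bool array, True = missing.
    All parameter names and defaults are assumptions made for this sketch.
    """
    h, w = image.shape
    gy, gx = np.gradient(image)            # derivatives along rows (y) and columns (x)
    ys, xs = np.mgrid[0:h, 0:w]            # pixel coordinate grids
    splat = np.zeros((h, w), dtype=np.float64)

    # Subsample the masked pixels to keep the loop cheap.
    centers = np.argwhere(mask)[::stride]
    for cy, cx in centers:
        # The edge direction is perpendicular to the gradient direction.
        theta = np.arctan2(gy[cy, cx], gx[cy, cx]) + np.pi / 2
        mag = np.hypot(gx[cy, cx], gy[cy, cx])
        # Assumed rule: stronger gradients stretch the Gaussian along the edge.
        sigma_along = base_sigma * (1.0 + stretch * mag / (1.0 + mag))
        sigma_across = base_sigma

        # Rotate pixel offsets into the Gaussian's own (along, across) frame.
        dx, dy = xs - cx, ys - cy
        u = dx * np.cos(theta) + dy * np.sin(theta)
        v = -dx * np.sin(theta) + dy * np.cos(theta)
        splat += np.exp(-0.5 * ((u / sigma_along) ** 2 + (v / sigma_across) ** 2))

    peak = splat.max()
    return splat / peak if peak > 0 else splat   # normalise to [0, 1]

def multiscale_splat_maps(image, mask, sigmas=(1.0, 2.0, 4.0)):
    """Stack splat maps computed at several (assumed) base scales."""
    return np.stack([anisotropic_splat_map(image, mask, base_sigma=s) for s in sigmas])
```

In a setup like this, the stacked maps could be concatenated with the masked image as extra input channels to the denoising network. That is one plausible way to integrate splat guidance into the diffusion process; the abstract does not state the mechanism actually used.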
