KAO: Kernel-Adaptive Optimization in Diffusion for Satellite Image
Teerapong Panboonyuen
TL;DR
KAO introduces Kernel-Adaptive Optimization within diffusion models to perform high-fidelity inpainting of very high-resolution (VHR) satellite images. It combines latent-space conditioning with Explicit Propagation and integrates with a Token Pyramid Transformer, adaptively modulating diffusion kernels and propagating information across scales to prioritize structurally important regions. Empirical results on Massachusetts Roads and DeepGlobe show state-of-the-art performance in FID, precision, and recall, at lower computational cost than the baselines. The method restores images robustly under cloud and mist occlusions and offers practical strategies for efficient deployment in remote sensing workflows. This work advances satellite image restoration by uniting kernel-adaptive denoising with hierarchical latent representations for scalable, high-quality inpainting.
Abstract
Satellite image inpainting is a crucial task in remote sensing, where accurately restoring missing or occluded regions is essential for robust image analysis. In this paper, we propose KAO, a novel framework that utilizes Kernel-Adaptive Optimization within diffusion models for satellite image inpainting. KAO is specifically designed to address the challenges posed by very high-resolution (VHR) satellite datasets, such as DeepGlobe and the Massachusetts Roads Dataset. Unlike existing methods that rely on preconditioned models requiring extensive retraining or postconditioned models with significant computational overhead, KAO introduces a Latent Space Conditioning approach, optimizing a compact latent space to achieve efficient and accurate inpainting. Furthermore, we incorporate Explicit Propagation into the diffusion process, facilitating forward-backward fusion, which improves the stability and precision of the method. Experimental results demonstrate that KAO sets a new benchmark for VHR satellite image restoration, providing a scalable, high-performance solution that balances the efficiency of preconditioned models with the flexibility of postconditioned models.
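For readers unfamiliar with the postconditioned family that the abstract contrasts KAO against, the following is a minimal sketch of a generic mask-guided blending step used by such inpainting samplers. It is not KAO's Latent Space Conditioning or Explicit Propagation, whose details are not given here. The function names, the mask convention (1 for observed pixels, 0 for occluded ones), and the `alpha_bar_t` noise-schedule value are illustrative assumptions.

```python
import math
import random


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t ~ N(sqrt(a) * x0, (1 - a) * I) for a DDPM-style forward process.

    x0 is a 2D list of pixel values; alpha_bar_t is the cumulative
    noise-schedule product at step t (assumed to lie in [0, 1]).
    """
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [[scale * v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in x0]


def blend_known_region(x_t_generated, x0_observed, mask, alpha_bar_t, rng):
    """Keep generated content where mask == 0 and re-noised observations where mask == 1.

    This is the generic post-hoc conditioning step: the observed pixels are
    brought to the current noise level, then pasted over the model's sample.
    """
    x_t_known = forward_noise(x0_observed, alpha_bar_t, rng)
    return [
        [m * k + (1.0 - m) * g for m, k, g in zip(m_row, k_row, g_row)]
        for m_row, k_row, g_row in zip(mask, x_t_known, x_t_generated)
    ]


# Hypothetical 2x4 example: the left half is observed, the right half is occluded.
rng = random.Random(0)
observed = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
generated = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
mask = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
blended = blend_known_region(generated, observed, mask, alpha_bar_t=1.0, rng=rng)
```

With `alpha_bar_t=1.0` no noise is added, so the observed half passes through unchanged and the occluded half keeps the generated values. Applying a step like this at every sampling iteration is one source of the overhead that the abstract attributes to postconditioned models.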
