Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration

Yuyang Hu, Kangfu Mei, Mojtaba Sahraee-Ardakan, Ulugbek S. Kamilov, Peyman Milanfar, Mauricio Delbracio

TL;DR

Kernel Density Steering (KDS) is an inference-time, plug-and-play framework for diffusion-based image restoration that guides an $N$-particle latent ensemble toward high-density posterior modes using patch-wise kernel density estimation (KDE) gradients. By performing collective mode seeking via a mean-shift-like update, KDS reduces artifacts and improves both distortion and perceptual quality without retraining or knowledge of the degradation model. Empirical results on real-world super-resolution and inpainting tasks show consistent gains in PSNR/SSIM and perceptual metrics, along with robustness to hyperparameters. The approach offers a scalable, model-agnostic enhancement to diffusion samplers, with practical impact for high-fidelity image restoration in diverse real-world settings.
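
For readers unfamiliar with the link between KDE gradients and mean shift, the standard identity can be written as follows. This is a sketch under assumptions: the Gaussian kernel, the bandwidth symbol $h$, and the particle notation ${\bm{x}}^{(j)}$ are illustrative choices, not taken from the paper, whose exact formulation may differ.

$$
\hat{p}({\bm{x}}) \propto \frac{1}{N}\sum_{j=1}^{N} w_j({\bm{x}}), \qquad w_j({\bm{x}}) = \exp\!\left(-\frac{\|{\bm{x}}-{\bm{x}}^{(j)}\|^2}{2h^2}\right)
$$

$$
\nabla_{{\bm{x}}} \log \hat{p}({\bm{x}}) = \frac{1}{h^2}\left(\frac{\sum_{j=1}^{N} w_j({\bm{x}})\,{\bm{x}}^{(j)}}{\sum_{j=1}^{N} w_j({\bm{x}})} - {\bm{x}}\right)
$$

The term in parentheses is the mean-shift vector: it points from ${\bm{x}}$ toward the kernel-weighted average of the ensemble, so a small step along it moves ${\bm{x}}$ toward a nearby density peak.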

Abstract

Diffusion models show promise for image restoration, but existing methods often struggle with inconsistent fidelity and undesirable artifacts. To address this, we introduce Kernel Density Steering (KDS), a novel inference-time framework promoting robust, high-fidelity outputs through explicit local mode-seeking. KDS employs an $N$-particle ensemble of diffusion samples, computing patch-wise kernel density estimation gradients from their collective outputs. These gradients steer patches in each particle towards shared, higher-density regions identified within the ensemble. This collective local mode-seeking mechanism, acting as "collective wisdom", steers samples away from spurious modes prone to artifacts, arising from independent sampling or model imperfections, and towards more robust, high-fidelity structures. This allows us to obtain better quality samples at the expense of higher compute by simultaneously sampling multiple particles. As a plug-and-play framework, KDS requires no retraining or external verifiers, seamlessly integrating with various diffusion samplers. Extensive numerical validations demonstrate KDS substantially improves both quantitative and qualitative performance on challenging real-world super-resolution and image inpainting tasks.
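
The patch-wise steering step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the Gaussian kernel, the non-overlapping patch layout, and the `patch`, `bandwidth`, and `step` parameters are all assumptions made for illustration. It applies a single mean-shift-style update to an ensemble of 2D arrays, whereas KDS applies its guidance to latents inside a diffusion sampler.

```python
import numpy as np

def extract_patches(z, p):
    """Split an (N, H, W) ensemble into non-overlapping p x p patches: (N, P, p*p)."""
    n, h, w = z.shape
    assert h % p == 0 and w % p == 0, "H and W must be divisible by the patch size"
    z = z.reshape(n, h // p, p, w // p, p).transpose(0, 1, 3, 2, 4)
    return z.reshape(n, (h // p) * (w // p), p * p)

def merge_patches(patches, h, w, p):
    """Inverse of extract_patches: (N, P, p*p) back to (N, H, W)."""
    n = patches.shape[0]
    z = patches.reshape(n, h // p, w // p, p, p).transpose(0, 1, 3, 2, 4)
    return z.reshape(n, h, w)

def mean_shift_vectors(patches, bandwidth):
    """Mean-shift vector for every particle at every patch location.

    patches: (N, P, D). For each patch location, the N particles' patches form
    the sample set of a Gaussian KDE (assumed kernel, hypothetical bandwidth).
    """
    x = patches.transpose(1, 0, 2)                       # (P, N, D)
    diff = x[:, :, None, :] - x[:, None, :, :]           # (P, N, N, D)
    d2 = (diff ** 2).sum(axis=-1)                        # squared distances (P, N, N)
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))       # Gaussian kernel weights
    weights /= weights.sum(axis=2, keepdims=True)        # normalize over neighbors j
    weighted_mean = np.einsum("pij,pjd->pid", weights, x)
    return (weighted_mean - x).transpose(1, 0, 2)        # back to (N, P, D)

def kde_steer(z, patch=4, bandwidth=1.0, step=0.5):
    """One steering update: move each patch toward the ensemble's local density peak."""
    n, h, w = z.shape
    patches = extract_patches(z, patch)
    patches = patches + step * mean_shift_vectors(patches, bandwidth)
    return merge_patches(patches, h, w, patch)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(8, 8))                       # shared underlying structure
    ensemble = base[None] + 0.1 * rng.normal(size=(6, 8, 8))
    steered = kde_steer(ensemble)
    print("variance before:", ensemble.var(axis=0).mean())
    print("variance after: ", steered.var(axis=0).mean())
```

Working per patch rather than on whole images keeps the density estimate in a low-dimensional space, where a small ensemble can still yield meaningful kernel weights. The cost of this sketch grows quadratically with the number of particles because of the pairwise distances.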

Paper Structure

This paper contains 34 sections, 10 equations, 18 figures, 14 tables, 5 algorithms.

Figures (18)

  • Figure 1: Conceptual illustration of Kernel Density Steering (KDS) for diffusion sampling. Left: Standard diffusion sampling with $N$ independent particles (blue dots) can result in high variance. Right: KDS utilizes the ensemble of $N$ particles to estimate local density. It then guides these particles towards shared high-density modes (peaks in the density curve) during the diffusion process.
  • Figure 2: Kernel Density Steering (KDS) sharpens sample distributions in a 2D Mixture of Gaussians (MoG) toy problem. The target distribution $p({\bm{x}}_0)$ consists of three distinct Gaussian modes. Samples are drawn using DDIM with the exact score function (purple dots) versus DDIM augmented with KDS (green dots, $N=50$ particles). KDS guides particles as the reverse diffusion progresses, leading to significantly higher sample concentration at the mode peaks compared to standard DDIM.
  • Figure 3: Qualitative comparison of $4\times$ Real-world Super-Resolution with KDS-enhanced DDIM sampling (Number of Particles $N=10$). 'w/o KDS' shows results from baseline DDIM, while 'w/ KDS' shows results with KDS. KDS consistently produces images with improved sharpness, finer details, and reduced artifacts across LDM-SR, DiffBIR, and SeeSR backbones.
  • Figure 4: Robustness analysis of KDS on the RealSR dataset. Left: Scatter plots comparing the PSNR and LPIPS of DDIM with KDS against the worst-performing particle of a standard DDIM ensemble ($N=10$). KDS consistently improves the quality of the worst-case samples. Right: Qualitative examples comparing the worst-performing output of a standard DDIM ensemble with the output of KDS-guided DDIM using the same random seed, demonstrating KDS's superior consistency and artifact reduction.
  • Figure 5: Box-inpainting performance of KDS. KDS generates more coherent and detailed inpainted regions compared to standard DDIM sampling.
  • ...and 13 more figures