Deep Spectral Prior
Yanqi Cheng, Xuxiang Zhao, Tieyong Zeng, Pietro Lio, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero
TL;DR
The paper introduces the Deep Spectral Prior (DSP), a frequency-domain unsupervised framework for image reconstruction that operates in the complex domain to learn amplitude and phase directly. It proves that the DSP loss is equivalent to the pixel-domain objective under a unitary Fourier transform, but yields different, more stable descent dynamics than DIP, including a spectral stability law that orders convergence by frequency and eliminates the need for early stopping. Through NTK-based analysis and spectral decompositions, the authors show how DSP progressively recovers low-frequency content while suppressing high-frequency noise, effectively acting as an implicit frequency-domain regulariser. Empirically, DSP outperforms DIP and other baselines across denoising, inpainting, deblurring, restoration, and super-resolution, demonstrating improved fidelity, robustness, and interpretability in a data-free setting. The work presents a unified frequency-based perspective on implicit priors, with strong theoretical and practical implications for single-image reconstruction tasks.
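The equivalence claim rests on Parseval's identity: under a unitary (orthonormally scaled) discrete Fourier transform, the squared error between two images is the same whether it is measured on pixels or on Fourier coefficients. The following is a minimal numerical check of that identity only, not the authors' implementation; the function names `pixel_loss` and `spectral_loss` and the random test images are illustrative assumptions.

```python
import numpy as np

def pixel_loss(x, y):
    # Sum of squared pixel differences.
    return float(np.sum(np.abs(x - y) ** 2))

def spectral_loss(x, y):
    # Sum of squared differences between complex Fourier coefficients.
    # norm="ortho" scales the 2-D DFT so that it is unitary.
    X = np.fft.fft2(x, norm="ortho")
    Y = np.fft.fft2(y, norm="ortho")
    return float(np.sum(np.abs(X - Y) ** 2))

# Hypothetical data: a random "clean" image and a noisy copy of it.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y = x + 0.1 * rng.standard_normal((32, 32))

# The two losses agree up to floating-point error.
print(np.isclose(pixel_loss(x, y), spectral_loss(x, y)))
```

The loss values coincide, so any difference in behaviour between the two formulations has to come from the optimisation dynamics, which is where the TL;DR locates it.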
Abstract
We introduce the Deep Spectral Prior (DSP), a new framework for unsupervised image reconstruction that operates entirely in the complex frequency domain. Unlike the Deep Image Prior (DIP), which optimises pixel-level errors and is highly sensitive to overfitting, DSP performs joint learning of amplitude and phase to capture the full spectral structure of images. We derive a rigorous theoretical characterisation of DSP's optimisation dynamics, proving that it follows frequency-dependent descent trajectories that separate informative low-frequency modes from stochastic high-frequency noise. This spectral mode separation explains DSP's self-regularising behaviour and, for the first time, formally establishes that DSP removes DIP's major limitation: its reliance on manual early stopping. Moreover, DSP induces an implicit projection onto a frequency-consistent manifold, ensuring convergence to stable, physically plausible reconstructions without explicit priors or supervision. Extensive experiments on denoising, inpainting, and deblurring demonstrate that DSP consistently surpasses DIP and other unsupervised baselines, achieving superior fidelity, robustness, and theoretical interpretability within a unified, unsupervised, data-free framework.
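To make "amplitude and phase" concrete, the sketch below splits an image's complex spectrum into its two components and recombines them. It only illustrates the decomposition, under the assumption of an orthonormal 2-D DFT; it does not reproduce the DSP network or its training loop, and the helper names `to_amp_phase` and `from_amp_phase` are hypothetical.

```python
import numpy as np

def to_amp_phase(img):
    # Complex spectrum of a real image, split into amplitude and phase.
    spec = np.fft.fft2(img, norm="ortho")
    return np.abs(spec), np.angle(spec)

def from_amp_phase(amp, phase):
    # Recombine amplitude and phase, then invert the transform.
    spec = amp * np.exp(1j * phase)
    return np.fft.ifft2(spec, norm="ortho").real

# Hypothetical test image.
rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))

amp, phase = to_amp_phase(img)
full = from_amp_phase(amp, phase)                     # both components
amp_only = from_amp_phase(amp, np.zeros_like(phase))  # phase discarded

print(np.allclose(full, img))      # exact recovery needs both parts
print(np.allclose(amp_only, img))  # amplitude alone does not recover it
```

Discarding the phase changes the reconstruction even though every amplitude is kept, which is one way to see why a frequency-domain method would need both components, either separately or through the complex coefficients.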
