Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction
Shijun Liang, Evan Bell, Qing Qu, Rongrong Wang, Saiprasad Ravishankar
TL;DR
This work analyzes why Deep Image Prior (DIP) can reconstruct images from undersampled measurements and how training dynamics in the neural tangent kernel (NTK) regime interact with forward operators such as those in MRI. It derives conditions under which DIP can overfit or fail to recover high-frequency content, and proposes a self-guided DIP that jointly optimizes the network weights and the network input with a denoiser-based regularizer, removing the need for training data or reference images. The authors demonstrate that self-guided DIP outperforms vanilla DIP, reference-guided DIP, and several supervised baselines on MRI reconstruction tasks (fastMRI knee/brain, Stanford FSE) and on image inpainting (CBSD68), with reduced spectral bias and negligible overfitting. This unsupervised, instance-adaptive approach yields competitive or superior results across diverse datasets while maintaining data consistency and robustness to distribution shifts, indicating strong practical potential for medical imaging and related inverse problems.
Abstract
The ability of deep image prior (DIP) to recover high-quality images from incomplete or corrupted measurements has made it popular for inverse problems in image restoration and medical imaging, including magnetic resonance imaging (MRI). However, conventional DIP suffers from severe overfitting and spectral bias effects. In this work, we first provide an analysis of how DIP recovers information from undersampled imaging measurements by analyzing the training dynamics of the underlying networks in the kernel regime for different architectures. This study sheds light on important underlying properties for DIP-based recovery. Current research suggests that incorporating a reference image as network input can enhance DIP's performance in image reconstruction compared to using random inputs. However, obtaining suitable reference images requires supervision and raises practical difficulties. To overcome this obstacle, we further introduce a self-driven reconstruction process that concurrently optimizes both the network weights and the input while eliminating the need for training data. Our method incorporates a novel denoiser regularization term that enables robust and stable joint estimation of both the network input and the reconstructed image. We demonstrate that our self-guided method surpasses both the original DIP and modern supervised methods in terms of MR image reconstruction performance and outperforms previous DIP-based schemes for image inpainting.
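The abstract does not spell out the objective, so the following is only a minimal sketch of what joint optimization of the weights and the input with a denoiser-style regularizer could look like. It assumes a squared data-fidelity term plus a penalty asking the network to map a noise-perturbed input back to the input itself. A plain linear map W stands in for the convolutional network, and all names (W, alpha, noise_std, self_guided_loss_and_grads, joint_step) are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def self_guided_loss_and_grads(W, z, A, y, eta, alpha):
    """Loss and gradients for a toy self-guided objective (illustrative only).

    W     : (n, n) linear map standing in for the network (assumption)
    z     : (n,)   network input, optimized jointly with W
    A     : (m, n) forward operator, e.g. a row-subsampling mask
    y     : (m,)   measurements
    eta   : (n,)   noise added to the input
    alpha : float  weight of the denoiser-style regularizer
    """
    u = z + eta              # noise-perturbed input
    out = W @ u              # "network" output
    r_data = A @ out - y     # data-consistency residual
    r_reg = out - z          # output should map the noisy input back to z
    loss = r_data @ r_data + alpha * (r_reg @ r_reg)

    # Gradient of the loss with respect to the output, then chain rule.
    g_out = 2.0 * (A.T @ r_data) + 2.0 * alpha * r_reg
    grad_W = np.outer(g_out, u)
    grad_z = W.T @ g_out - 2.0 * alpha * r_reg
    return loss, grad_W, grad_z


def joint_step(W, z, A, y, rng, alpha=1.0, noise_std=0.05, lr=1e-3):
    """One gradient step on both W and z, with freshly sampled input noise."""
    eta = noise_std * rng.standard_normal(z.shape)
    loss, grad_W, grad_z = self_guided_loss_and_grads(W, z, A, y, eta, alpha)
    return W - lr * grad_W, z - lr * grad_z, loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 16
    x_true = rng.standard_normal(n)
    keep = np.sort(rng.choice(n, size=n // 2, replace=False))
    A = np.eye(n)[keep]                      # inpainting-style mask, half the entries
    y = A @ x_true
    W = 0.1 * rng.standard_normal((n, n))    # random initial "weights"
    z = A.T @ y                              # zero-filled input (a choice for this sketch)
    for _ in range(200):
        W, z, loss = joint_step(W, z, A, y, rng)
    print(f"loss at last step: {loss:.4f}")
```

In a real setting the linear map would be replaced by a network trained with automatic differentiation, and the step size, noise level, and regularization weight would need tuning; the sketch only shows that the input z receives its own gradient alongside the weights.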
