Recovery Guarantees of Unsupervised Neural Networks for Inverse Problems trained with Gradient Descent
Nathan Buskulic, Jalal Fadili, Yvain Quéau
TL;DR
The paper establishes theoretical guarantees for unsupervised neural networks that solve inverse problems when trained with gradient descent. It extends Kurdyka-Łojasiewicz-based recovery guarantees from gradient flow to discrete gradient descent with a suitably chosen learning rate $\gamma$, showing convergence to zero loss and provable recovery under a restricted injectivity condition. Discretization changes the resulting overparametrization bound only by a constant. The paper further derives probabilistic overparametrization bounds under which a two-layer DIP achieves these guarantees with high probability, and provides numerical validation on synthetic and image-like data that illustrates the trade-offs between conditioning, network width, and early stopping. Overall, the work bridges continuous-time guarantees and practical, discrete optimization for unsupervised inverse problems, and informs the choice of architecture and step size for reliable reconstructions.
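To make the distinction between the two training dynamics concrete, they can be written side by side. The notation here is an assumption introduced for illustration and is not taken from the text above: $F$ denotes the forward operator, $g(u, \theta)$ the network with fixed latent input $u$ and parameters $\theta$, $y$ the observation, and $\mathcal{L}$ the loss.

$$
\dot{\theta}(t) = -\nabla_{\theta}\, \mathcal{L}\big(F g(u, \theta(t)),\, y\big)
\qquad \text{(gradient flow)}
$$

$$
\theta_{\tau+1} = \theta_{\tau} - \gamma\, \nabla_{\theta}\, \mathcal{L}\big(F g(u, \theta_{\tau}),\, y\big)
\qquad \text{(gradient descent with step size } \gamma\text{)}
$$

Gradient descent is the explicit Euler discretization of the flow, so the flow is recovered formally as $\gamma \to 0$; the results summarized above concern a fixed, sufficiently small $\gamma > 0$.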
Abstract
Advanced machine learning methods, most prominently neural networks, have become standard tools for solving inverse problems in recent years. However, theoretical recovery guarantees for such methods remain scarce and difficult to obtain. Only recently have unsupervised methods such as Deep Image Prior (DIP) been equipped with convergence and recovery guarantees for generic loss functions when trained through gradient flow with an appropriate initialization. In this paper, we extend these results by proving that the guarantees still hold when using gradient descent with an appropriately chosen step size (learning rate). We also show that the discretization affects the overparametrization bound for a two-layer DIP network only by a constant, and thus that the guarantees established for gradient flow carry over to gradient descent.
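The setting described in the abstract can be illustrated with a minimal NumPy sketch of a two-layer DIP trained by plain gradient descent on a noiseless linear inverse problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `train_two_layer_dip`, the tanh activation, the $1/\sqrt{k}$ output scaling, training only the hidden layer `W` while the output layer `V` stays fixed, the squared loss, the Gaussian forward operator `A`, and the values of the width `k` and step size `gamma`.

```python
import numpy as np


def train_two_layer_dip(A, y, k=500, d=5, gamma=0.05, n_iter=2000, seed=0):
    """Gradient descent on 0.5 * ||A g(u, W) - y||^2 for a two-layer network.

    Assumed model (illustrative): g(u, W) = V tanh(W u) / sqrt(k),
    with a fixed random latent input u and a fixed output layer V.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)              # fixed latent input on the unit sphere
    W = rng.standard_normal((k, d))     # trained hidden layer
    V = rng.standard_normal((n, k))     # output layer, kept fixed (assumption)

    losses = []
    for _ in range(n_iter):
        h = np.tanh(W @ u)              # hidden activations, shape (k,)
        x = V @ h / np.sqrt(k)          # network output, shape (n,)
        r = A @ x - y                   # residual in observation space, shape (m,)
        losses.append(0.5 * r @ r)

        # Chain rule: dL/dW = ((V^T A^T r) * tanh'(W u) / sqrt(k)) u^T
        back = (V.T @ (A.T @ r)) * (1.0 - h**2) / np.sqrt(k)
        W -= gamma * np.outer(back, u)  # one gradient-descent step

    x_hat = V @ np.tanh(W @ u) / np.sqrt(k)
    return x_hat, np.array(losses)


# Toy problem: m >= n with a Gaussian operator, so A is injective
# (with probability one) and a vanishing loss implies recovery of x_true.
rng = np.random.default_rng(1)
m, n = 20, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = rng.standard_normal(n)
y = A @ x_true

x_hat, losses = train_two_layer_dip(A, y)
print("initial loss:", losses[0])
print("final loss:", losses[-1])
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```

In this sketch the width `k` is much larger than the signal dimension `n`, which mimics the overparametrized regime, and `gamma` is kept small relative to the scale of `A` and `V`. Increasing `gamma` too far can make the iterates diverge, which is the practical counterpart of the step-size condition mentioned above.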
