Simple Alternating Minimization Provably Solves Complete Dictionary Learning
Geyu Liang, Gavin Zhang, Salar Fattahi, Richard Y. Zhang
TL;DR
This work tackles noiseless complete dictionary learning by proposing a simple alternating minimization scheme that converges linearly to the ground truth under mild initialization. A data-driven preconditioning step makes the method effective for general complete dictionaries and enables scalable mini-batch and online updates without relying on incoherence or restricted isometry property (RIP) assumptions. Theoretical guarantees accompany the practical algorithms, including exact support recovery and convergence bounds with explicit sample complexities, plus a warm-start initialization strategy. Empirical results on synthetic and real data demonstrate fast convergence and better performance than K-SVD and related methods in image denoising and inpainting, highlighting both the theoretical and the practical impact for large-scale dictionary learning.
Abstract
This paper focuses on the noiseless complete dictionary learning problem, where the goal is to represent a set of given signals as linear combinations of a small number of atoms from a learned dictionary. Theoretical and practical studies of dictionary learning face two main challenges: the heuristic algorithms used in practice lack theoretical guarantees, and they scale poorly to very large datasets. To address these issues, we propose a simple and efficient algorithm that provably recovers the ground truth when applied to the nonconvex and discrete formulation of the problem in the noiseless setting. We also extend our proposed method to mini-batch and online settings, where the data is very large or arrives continuously over time. At the core of our proposed method lies an efficient preconditioning technique that transforms the unknown dictionary into a near-orthonormal one, for which we prove that a simple alternating minimization technique converges linearly to the ground truth under minimal conditions. Our numerical experiments on synthetic and real datasets show that our method outperforms existing techniques.
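The abstract does not spell out the update rules, so the following is only a minimal sketch of what alternating minimization can look like once the dictionary is already orthonormal, that is, with the preconditioning step assumed done. The hard-thresholding sparse-coding step, the polar-factor (orthogonal Procrustes) dictionary update, the threshold `tau`, the synthetic data model, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def hard_threshold(Z, tau):
    """Zero out every entry whose magnitude is at most tau."""
    return np.where(np.abs(Z) > tau, Z, 0.0)


def polar_factor(M):
    """Nearest orthogonal matrix to M in Frobenius norm, computed via the SVD."""
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt


def alternating_minimization(Y, D0, tau, n_iters=30):
    """Alternate a sparse-coding step and an orthogonal dictionary update."""
    D = D0
    for _ in range(n_iters):
        X = hard_threshold(D.T @ Y, tau)  # sparse codes for the current dictionary
        D = polar_factor(Y @ X.T)         # best orthogonal D for the current codes
    return D, hard_threshold(D.T @ Y, tau)


def make_problem(n=10, p=2000, density=0.2, init_noise=0.01, seed=0):
    """Hypothetical noiseless instance Y = D_true @ X_true with a warm start D0."""
    rng = np.random.default_rng(seed)
    D_true = polar_factor(rng.standard_normal((n, n)))  # random orthogonal matrix
    support = rng.random((n, p)) < density
    # Nonzero magnitudes lie in [1, 2], so they stay well above the threshold.
    values = rng.uniform(1.0, 2.0, (n, p)) * rng.choice([-1.0, 1.0], (n, p))
    X_true = support * values
    Y = D_true @ X_true
    # Warm start: a small perturbation of the truth, projected back to orthogonal.
    D0 = polar_factor(D_true + init_noise * rng.standard_normal((n, n)))
    return Y, D_true, X_true, D0


Y, D_true, X_true, D0 = make_problem()
D_hat, X_hat = alternating_minimization(Y, D0, tau=0.5)
err_init = np.linalg.norm(D0 - D_true)
err_final = np.linalg.norm(D_hat - D_true)
print(f"initial error {err_init:.2e}, final error {err_final:.2e}")
```

The warm start matters in this sketch: because `D0` is close to `D_true`, there is no sign or permutation ambiguity to resolve when measuring the error. Bounding the nonzero code magnitudes away from zero is what lets a fixed threshold separate the support from the perturbation caused by dictionary error.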
