Preconditioned subgradient method for composite optimization: overparameterization and fast convergence
Mateo Díaz, Liwei Jiang, Abdel Ghani Labassi
TL;DR
This paper addresses the slow convergence of subgradient methods for composite optimization when the outer convex function is well-conditioned but the inner smooth map is ill-conditioned or overparameterized. It introduces a Levenberg–Morrison–Marquardt-type subgradient algorithm with a regularized Gauss–Newton preconditioner, along with two practical parameter configurations that yield linear convergence at rates depending only on the outer convex function h. The authors develop general regularity conditions on the parameterization and the outer loss, covering both nonsmooth and smooth outer functions, and show that these conditions hold for problems such as square-variable formulations, matrix sensing, and CP tensor factorization. They also show that the theory extends to local regularity regimes, and they report numerical experiments, including tests of robustness to outliers and of dimension-independent convergence. Overall, the work offers a preconditioned method with convergence guarantees for a broad class of overparameterized and ill-conditioned composite optimization problems arising in data science and signal processing.
Abstract
Composite optimization problems involve minimizing the composition of a smooth map with a convex function. Such objectives arise in numerous data science and signal processing applications, including phase retrieval, blind deconvolution, and collaborative filtering. The subgradient method achieves local linear convergence when the composite loss is well-conditioned. However, if the smooth map is, in a certain sense, ill-conditioned or overparameterized, the subgradient method exhibits much slower sublinear convergence even when the convex function is well-conditioned. To overcome this limitation, we introduce a Levenberg-Morrison-Marquardt subgradient method that converges linearly under mild regularity conditions at a rate determined solely by the convex function. Further, we demonstrate that these regularity conditions hold for several problems of practical interest, including square-variable formulations, matrix sensing, and tensor factorization. Numerical experiments illustrate the benefits of our method.
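To make the idea of a preconditioned subgradient step concrete, the following is a minimal sketch of a damped Gauss-Newton preconditioned subgradient iteration for minimizing h(c(x)). It is not the authors' implementation: the function names, the Polyak-type step size (which assumes the optimal value of h is known), and the damping rule proportional to the current objective gap are illustrative assumptions, since the abstract does not state the parameter choices. The helper `squared_variable_problem` is likewise a hypothetical toy instance with c(x) = x ⊙ x and h(z) = ||z − b||_1.

```python
import numpy as np


def lm_subgradient(c, jac, h, subgrad_h, x0, h_min=0.0, iters=100, damping=1.0):
    """Damped Gauss-Newton preconditioned subgradient iteration for min_x h(c(x)).

    ASSUMPTIONS (illustrative, not the paper's stated configuration):
    a Polyak-type step that uses the known optimal value h_min, and a
    damping parameter lam proportional to the current gap h(c(x)) - h_min.
    """
    x = np.asarray(x0, dtype=float).copy()
    history = []  # objective gap recorded before each update
    for _ in range(iters):
        z = c(x)
        gap = h(z) - h_min
        history.append(gap)
        if gap <= 0.0:
            break  # already optimal
        J = jac(x)                      # Jacobian of the smooth inner map c
        g = J.T @ subgrad_h(z)          # chain-rule subgradient of h(c(x))
        lam = damping * gap             # assumed damping rule
        # Preconditioned direction: (J^T J + lam * I)^{-1} J^T v
        d = np.linalg.solve(J.T @ J + lam * np.eye(x.size), g)
        denom = g @ d
        if denom <= 0.0:
            break  # zero subgradient: no descent direction available
        x = x - (gap / denom) * d       # Polyak-type step in the preconditioned metric
    return x, history


def squared_variable_problem(b):
    """Hypothetical toy instance: c(x) = x * x (elementwise), h(z) = ||z - b||_1."""
    b = np.asarray(b, dtype=float)

    def c(x):
        return x * x

    def jac(x):
        return np.diag(2.0 * x)

    def h(z):
        return np.sum(np.abs(z - b))

    def subgrad_h(z):
        return np.sign(z - b)

    return c, jac, h, subgrad_h
```

For instance, `lm_subgradient(*squared_variable_problem([0.0]), x0=[1.0], iters=20)` halves the iterate at every step in this one-dimensional case, because the Polyak-type step cancels the scalar preconditioner. In higher dimensions the behavior depends on the damping and step-size choices, which are exactly the parameters the paper's analysis is concerned with.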
