Inertial Bregman Proximal Gradient under Partial Smoothness
Jean-Jacques Godeme
TL;DR
The paper develops an Inertial Bregman Proximal Gradient (IBPG) method for minimizing $\Phi(x)=F(x)+G(x)$ in finite dimensions, where $F$ is smooth but possibly nonconvex and $G$ is proper, closed, and convex but possibly nonsmooth. A Bregman entropy $\psi$ replaces the global Lipschitz-gradient requirement on $F$ with relative smoothness, and the method is analyzed under the triangle scaling property (TSP), which is known to permit provable acceleration for certain entropies in the convex setting. Global convergence is guaranteed under the Kurdyka-Łojasiewicz (KL) framework when the entropy is strongly convex. Locally, if the nonsmooth part is partly smooth relative to a smooth manifold, the method identifies the active manifold in finitely many iterations and then converges linearly, at a rate obtained from a spectral analysis of a linearized map; in the smooth case, the approach also exhibits trap-avoidance properties around strict saddles. The authors illustrate the theory with phase retrieval experiments under various regularizers, which show finite active-set identification and convergence rates consistent with the theoretical estimates.
Abstract
This work considers an inertial version of the Bregman Proximal Gradient algorithm (IBPG) for minimizing the sum of two single-valued functions in finite dimension. We suppose that one of the functions is proper, closed, and convex but not necessarily smooth, whilst the second is sufficiently smooth but not necessarily convex. For the latter, we require the smooth adaptable property (smad) with respect to some kernel or entropy, which makes it possible to drop the very popular global Lipschitz continuity requirement on the gradient of the smooth part. We consider IBPG under the framework of the triangle scaling property (TSP), a geometrical property under which one can provably ensure acceleration for a certain subset of kernel/entropy functions in the convex setting. Based on this property, we provide global convergence guarantees when the entropy is strongly convex, under the framework of the Kurdyka-Łojasiewicz (KL) property. Turning to the local convergence properties, we show that when the nonsmooth part is partly smooth relative to a smooth submanifold, IBPG has a finite activity identification property before entering a local linear convergence regime, for which we establish a sharp estimate of the convergence rate. We report numerical simulations on low-complexity regularized phase retrieval to illustrate our theoretical results.
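To make the type of iteration concrete, the following is a minimal sketch of an inertial Bregman proximal gradient loop on $\ell_1$-regularized phase retrieval. It is not taken from the paper: the quartic kernel $\psi(x)=\tfrac14\|x\|^4+\tfrac12\|x\|^2$, the relative smoothness constant, the fixed inertial parameter `beta`, the step-size rule, and all function names are assumptions chosen for illustration, and the paper's exact extrapolation scheme and parameter conditions may differ.

```python
import numpy as np

def grad_psi(x):
    # Gradient of the assumed kernel psi(x) = 1/4 ||x||^4 + 1/2 ||x||^2.
    return (x @ x + 1.0) * x

def bregman_step(p, thresh):
    # Solves min_u thresh*||u||_1 - <p, u> + psi(u) for the kernel above:
    # soft-threshold p, then rescale by the root t in (0, 1] of
    # t^3 * ||s||^2 + t - 1 = 0 (found here by bisection).
    s = np.sign(p) * np.maximum(np.abs(p) - thresh, 0.0)
    c = s @ s
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if c * mid**3 + mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi) * s

def grad_F(A, b, x):
    # F(x) = 1/(4m) * sum_i ((a_i^T x)^2 - b_i)^2  (smooth, nonconvex).
    Ax = A @ x
    return A.T @ ((Ax**2 - b) * Ax) / A.shape[0]

def objective(A, b, x, lam):
    return np.sum(((A @ x) ** 2 - b) ** 2) / (4 * A.shape[0]) + lam * np.sum(np.abs(x))

def ibpg_sketch(A, b, x0, lam=0.0, beta=0.3, step_scale=0.5, iters=200):
    # Assumed relative smoothness constant of F with respect to psi.
    row_sq = np.sum(A**2, axis=1)
    L = np.sum(3.0 * row_sq**2 + row_sq * np.abs(b)) / A.shape[0]
    gamma = step_scale / L
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + beta * (x - x_prev)                 # inertial extrapolation
        p = grad_psi(y) - gamma * grad_F(A, b, y)   # forward step in the dual (mirror) space
        x_prev, x = x, bregman_step(p, lam * gamma) # Bregman proximal (backward) step
    return x
```

Setting `beta=0` recovers a plain Bregman proximal gradient iteration, and `lam=0` gives the smooth case; the values of `beta` and `step_scale` above are illustrative defaults, not the conditions under which the paper's guarantees hold.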
