Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity

Puya Latafat, Andreas Themelis, Masoud Ahookhosh, Panagiotis Patrinos

Abstract

We introduce two algorithms for nonconvex regularized finite sum minimization, where typical Lipschitz differentiability assumptions are relaxed to the notion of relative smoothness. The first one is a Bregman extension of Finito/MISO, studied for fully nonconvex problems when the sampling is randomized, or under convexity of the nonsmooth term when it is essentially cyclic. The second algorithm is a low-memory variant, in the spirit of SVRG and SARAH, that also allows for fully nonconvex formulations. Our analysis is made remarkably simple by employing a Bregman Moreau envelope as Lyapunov function. In the randomized case, linear convergence is established when the cost function is strongly convex, yet with no convexity requirements on the individual functions in the sum. For the essentially cyclic and low-memory variants, global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property.
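To make the notion of relative smoothness concrete, here is a minimal one-dimensional sketch. It does not come from the paper; the functions `f` and `h`, the constant `L = 1`, and all helper names are assumptions chosen for illustration. Under one common definition, `f` is `L`-smooth relative to a reference function `h` when `L*h - f` and `L*h + f` are both convex, which yields the upper bound `f(x) <= f(y) + f'(y)(x - y) + L * D_h(x, y)`, where `D_h(x, y) = h(x) - h(y) - h'(y)(x - y)` is the Bregman distance. The quartic `f(x) = x^4/4` has no globally Lipschitz gradient, yet it is 1-smooth relative to `h(x) = x^4/4 + x^2/2`.

```python
import random

# Illustrative choices (assumptions, not taken from the paper):
# f has a gradient that is not globally Lipschitz, h is the reference function.
def f(x):
    return 0.25 * x**4

def grad_f(x):
    return x**3

def h(x):
    return 0.25 * x**4 + 0.5 * x**2

def grad_h(x):
    return x**3 + x

def bregman(phi, grad_phi, x, y):
    """Bregman distance D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - grad_phi(y) * (x - y)

def descent_gap(x, y, L=1.0):
    """Upper model built at y minus f(x); nonnegative under relative smoothness."""
    upper = f(y) + grad_f(y) * (x - y) + L * bregman(h, grad_h, x, y)
    return upper - f(x)

def check_relative_smoothness(n=1000, seed=0, L=1.0, tol=1e-9):
    """Sample random pairs and verify the Bregman upper bound numerically."""
    rng = random.Random(seed)
    return all(
        descent_gap(rng.uniform(-10, 10), rng.uniform(-10, 10), L) >= -tol
        for _ in range(n)
    )
```

For this particular pair, the gap equals `(x - y)^2 / 2`, because `h - f` is the quadratic `x^2 / 2`. A random check like this can only support a relative-smoothness claim, not prove it.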

Paper Structure

This paper contains 21 sections, 71 equations, 3 figures, 1 table, 3 algorithms.

Figures (3)

  • Figure 1: Representative convergence plots for problem \eqref{eq:regPR} with squared loss on a digits image: (first row) $\ell_1$-regularization, (second row) $\ell_0$-norm ball constraint. The related plots for the QR code images follow a very similar trend and are therefore omitted.
  • Figure 2: Image recovery with corrupted measurements for tolerances $\{10^{-5}, 10^{-7}\}$. The sparsity parameters $\kappa=160$ and $\kappa=125$ are used for the digit and the QR code, respectively.
  • Figure 3: Representative convergence plots for the $\ell_1$-regularized problem with Poisson loss.

Theorems & Definitions (14)

  • 14 entries, each labeled "proof"