Randomized block proximal method with locally Lipschitz continuous gradient

Pedro Pérez-Aros, David Torregrosa-Belén

TL;DR

This work addresses large-scale nonconvex optimization of the form $\min_x \varphi(x)=f(x)+g(x)$ with $f$ differentiable and $g$ block-separable, under only blockwise locally Lipschitz gradients for $f$ rather than global Lipschitz continuity. It introduces the Adaptive Randomized Block Proximal Gradient (ARBPG) method, which randomly selects blocks and uses an adaptive proximal stepsize to guarantee descent, optionally augmented by a boosted linesearch, without prior knowledge of local Lipschitz constants. The authors prove subsequential convergence to stationary points almost surely and establish a positive lower bound on stepsizes on bounded subsequences, ensuring progress. Numerical experiments on nonnegative matrix factorization for image compression demonstrate competitive performance and robustness to the relaxed gradient regularity assumptions, with code and data available publicly.

Abstract

Block-coordinate algorithms are known to provide efficient iterative schemes for large-scale problems, especially when computing full derivatives entails substantial memory and computational effort. In this paper, we investigate a randomized block proximal gradient algorithm for minimizing the sum of a differentiable function and a separable proper lower-semicontinuous function, both possibly nonconvex. In contrast to previous works, we only assume that the partial gradients of the differentiable function are locally Lipschitz continuous. At each iteration, the method adaptively selects a proximal stepsize to satisfy a sufficient decrease condition without prior knowledge of the local Lipschitz moduli of these partial gradients. In addition, we allow for an additional linesearch to enhance the performance of the algorithm. Our main result establishes subsequential convergence to a stationary point of the problem almost surely. Finally, we validate the method numerically in an image compression experiment using a nonnegative matrix factorization model.
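
To make the idea of a randomized block proximal step with an adaptively chosen stepsize concrete, the following is a minimal sketch on a nonnegative matrix factorization model, $\min_{U,V\ge 0} \tfrac12\|A-UV^{\top}\|_F^2$. It is not the authors' ARBPG implementation: the function name `block_prox_nmf_sketch`, the two-block split $(U, V)$, the plain backtracking rule, the particular decrease test, and all parameter values are assumptions chosen for illustration, and the optional boosted linesearch is omitted. Here $g$ is taken to be the indicator of the nonnegative orthant, so the proximal map reduces to a projection.

```python
import numpy as np

def block_prox_nmf_sketch(A, rank, iters=200, tau0=1.0, shrink=0.5,
                          sigma=0.25, max_backtracks=60, seed=0):
    """Illustrative randomized block proximal gradient for NMF.

    Hypothetical sketch: minimizes 0.5 * ||A - U V^T||_F^2 over U, V >= 0,
    updating one randomly chosen block (U or V) per iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U = rng.random((m, rank))
    V = rng.random((n, rank))

    def f(U, V):
        # Smooth part: half the squared Frobenius norm of the residual.
        return 0.5 * np.linalg.norm(A - U @ V.T) ** 2

    history = [f(U, V)]
    for _ in range(iters):
        block = rng.integers(2)          # 0 -> update U, 1 -> update V
        R = U @ V.T - A                  # residual at the current iterate
        if block == 0:
            X, grad = U, R @ V           # partial gradient with respect to U
        else:
            X, grad = V, R.T @ U         # partial gradient with respect to V

        # Backtracking (an assumed rule): shrink tau until the objective
        # decreases by at least (sigma / tau) * ||X_new - X||^2. No Lipschitz
        # constant of the partial gradient is used.
        tau = tau0
        accepted = False
        for _ in range(max_backtracks):
            # Prox of the nonnegativity indicator = projection onto X >= 0.
            X_new = np.maximum(X - tau * grad, 0.0)
            step_sq = np.sum((X_new - X) ** 2)
            f_new = f(X_new, V) if block == 0 else f(U, X_new)
            if f_new <= history[-1] - (sigma / tau) * step_sq:
                accepted = True
                break
            tau *= shrink

        if accepted:
            if block == 0:
                U = X_new
            else:
                V = X_new
            history.append(f_new)
        else:
            # Safety cap reached: keep the current iterate unchanged.
            history.append(history[-1])
    return U, V, history
```

Calling `block_prox_nmf_sketch(A, rank=3)` on a nonnegative matrix `A` returns the two factors and the objective values per iteration. Because every accepted step passes the decrease test, the recorded objective values are nonincreasing by construction; this mirrors the role of the sufficient decrease condition described above, without claiming to reproduce the paper's exact rule.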

Paper Structure

This paper contains 14 sections, 8 theorems, 63 equations, 1 figure, 5 tables, 3 algorithms.

Key Result

Lemma

Let $g\colon \mathbb{R}^n\to\overline{\mathbb{R}}$ be a proper lsc prox-bounded function with prox-boundedness threshold $\tau^{g}>0$. Given $v\in\mathbb{R}^n$, the function $g^{v}$ is prox-bounded with $\tau^{g^{v}}=\tau^{g}$. In addition, $\mathop{\mathrm{prox}}\nolimits_{\tau g^{v}}(\cdot) = \mat

Figures (1)

  • Figure 1: Original images and low-rank compressions of four Chilean landscape images obtained by each method.

Theorems & Definitions (17)

  • Lemma
  • Proof
  • Lemma
  • Proof
  • Lemma
  • Proof
  • Proposition
  • Proof
  • Theorem 1
  • Proof
  • ...and 7 more