Coordinate projected gradient descent minimization and its application to orthogonal nonnegative matrix factorization

Flavia Chorobura, Daniela Lupu, Ion Necoara

TL;DR

The paper develops a cyclic Coordinate Projected Gradient Descent (CPGD) method for large-scale nonconvex optimization problems of the form $F(x)= f(x) + \psi(x) + \phi(x)$, where $f$ has coordinate-wise Lipschitz gradients, $\psi$ may be nonseparable, and $\phi$ is the indicator function of a separable closed convex set. It introduces an adaptive stepsize that guarantees descent, and it proves KL-based convergence rates together with a worst-case complexity analysis. The approach is instantiated for penalized Orthogonal Nonnegative Matrix Factorization (ONMF), yielding explicit block updates and a cubic-root based stepsize computation, and it outperforms a BMM baseline on the Salinas dataset. The results offer a scalable, theoretically grounded framework for three-term composite nonconvex problems and a practical solver for ONMF with separable constraints.
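
Written out, the problem class and a generic coordinate projected gradient step might look as follows. This is only a sketch of the general shape: the block decomposition $X = X_1 \times \cdots \times X_N$, the block index $i_k$, the projection symbol $\Pi$, and the stepsize symbol $\alpha_k$ are notation assumed here for illustration, and the paper's adaptive rule for choosing the stepsize is not reproduced.

```latex
\min_{x \in \mathbb{R}^n} \; F(x) = f(x) + \psi(x) + \phi(x),
\qquad \phi = \text{indicator function of } X = X_1 \times \cdots \times X_N,
\\[4pt]
x_{i_k}^{k+1} = \Pi_{X_{i_k}}\!\bigl( x_{i_k}^{k} - \alpha_k \, \nabla_{i_k} (f + \psi)(x^{k}) \bigr),
\qquad x_j^{k+1} = x_j^{k} \quad (j \neq i_k).
```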

Abstract

In this paper we consider large-scale composite nonconvex optimization problems whose objective function is a sum of three terms: the first has a block coordinate-wise Lipschitz continuous gradient, the second is twice differentiable but nonseparable, and the third is the indicator function of some separable closed convex set. Under these general settings we derive and analyze a new cyclic coordinate descent method, which uses the partial gradient of the differentiable part of the objective, yielding a coordinate gradient descent scheme with a novel adaptive stepsize rule. We prove that this stepsize rule makes the coordinate gradient scheme a descent method, provided that additional assumptions hold for the second term in the objective function. We also present a worst-case complexity analysis for this new method in the nonconvex setting. Numerical results on the orthogonal nonnegative matrix factorization problem confirm the efficiency of our algorithm.
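
To make the idea of a cyclic coordinate projected gradient step concrete, here is a minimal Python sketch on a toy problem. It is not the paper's CPGD algorithm: it assumes a simplified objective $f(x) = \tfrac{1}{2}\|Ax - b\|^2$ with no nonseparable term $\psi$, takes the constraint set to be the nonnegative orthant, and uses the fixed stepsize $1/L_i$ (with $L_i$ the squared norm of column $i$ of $A$) in place of the adaptive rule. The function names `objective` and `cyclic_cpgd_nnls` are hypothetical.

```python
import random

def objective(A, b, x):
    """Return 0.5 * ||A x - b||^2, with A given as a list of rows."""
    return 0.5 * sum(
        (sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i) ** 2
        for row, b_i in zip(A, b)
    )

def cyclic_cpgd_nnls(A, b, n_epochs=50):
    """Cyclic coordinate projected gradient descent for
    min 0.5 * ||A x - b||^2 subject to x >= 0.

    Assumption: fixed stepsize 1 / L_i, not the paper's adaptive rule.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = [-b_i for b_i in b]  # equals A x - b at x = 0
    # Coordinate-wise Lipschitz constants: squared column norms of A.
    L = [sum(A[k][i] ** 2 for k in range(m)) for i in range(n)]
    history = [objective(A, b, x)]
    for _ in range(n_epochs):
        for i in range(n):  # sweep the coordinates in cyclic order
            if L[i] == 0.0:
                continue  # a zero column does not affect the objective
            # Partial gradient of f with respect to coordinate i.
            g_i = sum(A[k][i] * residual[k] for k in range(m))
            # Gradient step, then projection onto [0, +inf).
            x_new = max(0.0, x[i] - g_i / L[i])
            delta = x_new - x[i]
            for k in range(m):  # keep the residual A x - b up to date
                residual[k] += delta * A[k][i]
            x[i] = x_new
        history.append(objective(A, b, x))
    return x, history

random.seed(0)
A = [[random.gauss(0, 1) for _ in range(5)] for _ in range(20)]
b = [random.gauss(0, 1) for _ in range(20)]
x, history = cyclic_cpgd_nnls(A, b)
```

In this toy setting each coordinate update exactly minimizes a one-dimensional quadratic over $x_i \ge 0$, so the recorded objective values in `history` should not increase from one sweep to the next. The nonseparable term $\psi$ is what makes a fixed stepsize insufficient in general and motivates an adaptive rule.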

Paper Structure

This paper contains 5 sections, 5 theorems, 52 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Lemma 1

If Assumption 1 holds, then the iterates of algorithm CPGD satisfy the following descent:

Figures (1)

  • Figure 1: Salinas dataset with $r = 15$: evolution of the objective function values (top) and the orthogonal error (bottom) over time for BMM and CPGD.

Theorems & Definitions (11)

  • Definition 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Theorem 3
  • ...and 1 more