An Inexact Preconditioned Zeroth-order Proximal Method for Composite Optimization

Shanglin Liu, Lei Wang, Nachuan Xiao, Xin Liu

TL;DR

This paper proposes a preconditioned zeroth-order proximal gradient method in which the gradients and preconditioners are estimated by finite-difference schemes based on function values at the same trial points.

Abstract

In this paper, we consider the composite optimization problem, where the objective function integrates a continuously differentiable loss function with a nonsmooth regularization term. Moreover, only the function values for the differentiable part of the objective function are available. To efficiently solve this composite optimization problem, we propose a preconditioned zeroth-order proximal gradient method in which the gradients and preconditioners are estimated by finite-difference schemes based on the function values at the same trial points. We establish the global convergence and worst-case complexity for our proposed method. Numerical experiments exhibit the superiority of our developed method.
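
The idea of reusing the same trial points for both the gradient and the preconditioner can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a diagonal preconditioner built from central second differences, an $\ell_1$ regularizer $r(x) = \lambda \|x\|_1$, and a simple clamp that keeps the preconditioner positive. The names `fd_gradient_and_diagonal`, `preconditioned_zo_prox_step`, `h`, `gamma`, and `d_min` are all hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fd_gradient_and_diagonal(f, x, h=1e-4):
    """Estimate the gradient and a diagonal curvature from shared trial points.

    For each coordinate i, the trial points x + h*e_i and x - h*e_i are
    evaluated once. Their difference gives the gradient entry, and their sum
    combined with f(x) gives the second-derivative entry, so the curvature
    estimate costs no extra function evaluations.
    """
    fx = f(x)
    grad, diag = [], []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        grad.append((fp - fm) / (2.0 * h))           # central first difference
        diag.append((fp - 2.0 * fx + fm) / (h * h))  # central second difference
    return grad, diag


def preconditioned_zo_prox_step(f, x, lam, gamma=1.0, h=1e-4, d_min=1e-8):
    """One illustrative preconditioned proximal step for f(x) + lam * ||x||_1.

    With a positive diagonal preconditioner, the scaled proximal subproblem
    separates by coordinate and reduces to soft-thresholding.
    """
    grad, diag = fd_gradient_and_diagonal(f, x, h)
    new_x = []
    for xi, gi, di in zip(x, grad, diag):
        di = max(di, d_min)  # assumption: clamp so the preconditioner stays positive
        new_x.append(soft_threshold(xi - gamma * gi / di, gamma * lam / di))
    return new_x
```

For a separable quadratic loss, the finite differences are exact up to rounding, so a single step with `gamma=1.0` lands approximately on the minimizer of the composite objective. On general losses the estimates are inexact, and a practical method would need the safeguards and step-size rules analyzed in the paper.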

Paper Structure

This paper contains 15 sections, 7 theorems, 51 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Lemma 2

Let Assumption asp:function hold. For any $\gamma > 0$, we denote $\hat{x} := \mathrm{prox}_{\gamma r}(x - \gamma \nabla f(x))$. Suppose that the point $x \in \mathbb{R}^n$ satisfies the following condition, where $\epsilon > 0$ is a small constant. Then it holds that

Figures (2)

  • Figure 1: Comparison between IPZOPM and ZOPG in solving LASSO problem.
  • Figure 2: Comparison between IPZOPM and ZOPG in solving binary classification problems.

Theorems & Definitions (15)

  • Lemma 2
  • proof
  • Definition 3
  • Lemma 5
  • proof
  • Lemma 6
  • proof
  • Proposition 7
  • proof
  • Lemma 8
  • ...and 5 more