Zeroth-order Low-rank Hessian Estimation via Matrix Recovery

Tianyu Wang, Zicheng Wang, Jiajia Yu

TL;DR

For a rank-$r$ Hessian matrix $H \in \mathbb{R}^{n \times n}$, $\mathcal{O}(nr^2 \log^2 n)$ proper zeroth-order finite-difference computations ensure exact recovery of $H$ with high probability.

Abstract

A zeroth-order Hessian estimator aims to recover the Hessian matrix of an objective function at any given point, using minimal finite-difference computations. This paper studies zeroth-order Hessian estimation for low-rank Hessians from a matrix recovery perspective. The challenge is that traditional matrix recovery techniques are not directly suitable for this scenario: they either demand incoherence assumptions (or their variants), or require an impractical number of finite-difference computations in our setting. To overcome these hurdles, we employ zeroth-order Hessian estimators aligned with proper matrix measurements, and prove new recovery guarantees for these estimators. More specifically, we prove that for a Hessian matrix $H \in \mathbb{R}^{n \times n}$ of rank $r$, $\mathcal{O}(nr^2 \log^2 n)$ proper zeroth-order finite-difference computations ensure exact recovery of $H$ with high probability. Compared to existing methods, our method can greatly reduce the number of finite-difference computations, and does not require any incoherence assumptions.
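
To make "finite-difference computations" concrete, the sketch below estimates a single bilinear form $\mathbf{u}^\top H \mathbf{v}$ of a Hessian from four function evaluations, using the standard second-order central difference. It is an illustration only and not the estimator analyzed in the paper; the function names, the test function $f(\mathbf{x}) = \tfrac{1}{2} \| B \mathbf{x} \|^2$, and the step size `delta` are all assumptions made for this example.

```python
import random


def bilinear_hessian_estimate(f, x, u, v, delta=1e-2):
    """Estimate u^T H v (H = Hessian of f at x) from four evaluations of f.

    Standard central second difference; hypothetical helper for illustration.
    """
    def shifted(su, sv):
        # Point x + delta * (su * u + sv * v), with su, sv in {+1, -1}.
        return [xi + delta * (su * ui + sv * vi) for xi, ui, vi in zip(x, u, v)]

    return (f(shifted(1, 1)) - f(shifted(1, -1))
            - f(shifted(-1, 1)) + f(shifted(-1, -1))) / (4.0 * delta ** 2)


def make_low_rank_quadratic(B):
    """Return f(x) = 0.5 * ||B x||^2, whose Hessian B^T B has rank at most len(B)."""
    def f(x):
        return 0.5 * sum(sum(b * xj for b, xj in zip(row, x)) ** 2 for row in B)
    return f


def exact_bilinear(B, u, v):
    """Exact u^T (B^T B) v = (B u) . (B v), used only to check the estimate."""
    Bu = [sum(b * ui for b, ui in zip(row, u)) for row in B]
    Bv = [sum(b * vi for b, vi in zip(row, v)) for row in B]
    return sum(a * b for a, b in zip(Bu, Bv))


rng = random.Random(0)
n, r = 6, 2  # ambient dimension and Hessian rank (arbitrary choices)
B = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(r)]
f = make_low_rank_quadratic(B)
x = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]

estimate = bilinear_hessian_estimate(f, x, u, v)
exact = exact_bilinear(B, u, v)
print(estimate, exact)
```

For a quadratic $f$ the central difference has no truncation error, so the two printed values agree up to floating-point rounding. Each such measurement costs four evaluations of $f$; the paper's question is how many measurements of this kind are needed to recover a rank-$r$ Hessian exactly.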

Paper Structure

This paper contains 12 sections, 12 theorems, 82 equations, and 1 figure.

Key Result

Proposition 1

Consider an estimator defined in (eq:hess-est). Let the underlying function $f$ be twice continuously differentiable. Let $\mathbf{u}, \mathbf{v}$ be two random vectors such that $\| \mathbf{u} \|, \| \mathbf{v} \| < \infty$ a.s. Then for any fixed $\mathbf{x} \in \mathbb{R}^n$, as $\delta \to 0_+$, where $\to_d$ denotes convergence in distribution.

Figures (1)

  • Figure 1: Incoherence condition for $\nabla^2 f ( \mathbf{x} )$ at multiple points. When the Hessian of $f$ is low-rank or approximately low-rank, a matrix completion guarantee for $\nabla^2 f ( \mathbf{x} )$ at all $\mathbf{x}$ requires an incoherence condition to hold uniformly over $\mathbf{x}$. As illustrated in the right subfigure, such a requirement is overly restrictive.
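
The restrictiveness described in the Figure 1 caption can be illustrated with a small numerical sketch. It assumes the coherence parameter commonly used in the matrix completion literature, $\mu(U) = \frac{n}{r} \max_i \| U^\top \mathbf{e}_i \|^2$ for an $n \times r$ matrix $U$ with orthonormal columns spanning the column space; the exact condition used in the paper is not reproduced in this summary, so this definition and the helper name `coherence` are assumptions.

```python
import math


def coherence(U):
    """Coherence (n / r) * max_i ||i-th row of U||^2 for an n x r matrix U,
    given as a list of rows and assumed to have orthonormal columns."""
    n, r = len(U), len(U[0])
    return (n / r) * max(sum(entry ** 2 for entry in row) for row in U)


n = 8

# Rank-1 Hessian e_1 e_1^T: column space spanned by the first basis vector.
U_spiky = [[1.0 if i == 0 else 0.0] for i in range(n)]

# Rank-1 Hessian (1/n) * ones * ones^T: column space spanned by the flat vector.
U_flat = [[1.0 / math.sqrt(n)] for _ in range(n)]

mu_spiky = coherence(U_spiky)  # largest possible value, n
mu_flat = coherence(U_flat)    # smallest possible value, 1
print(mu_spiky, mu_flat)
```

Both matrices have rank one, yet their coherence differs by a factor of $n$. A guarantee that needs small coherence would therefore have to rule out Hessians like the first one at every point $\mathbf{x}$, which is the kind of uniform requirement the caption calls overly restrictive.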

Theorems & Definitions (22)

  • Proposition 1
  • proof
  • Theorem 1
  • Corollary 1
  • Lemma 1
  • proof
  • proof: Sketch of proof of Theorem \ref{thm:main} with (A1) and (A2) assumed
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 12 more