Augmenting Subspace Optimization Methods with Linear Bandits

Matt Menickelly

TL;DR

The paper addresses unconstrained optimization problems in which derivatives are costly, augmenting model-based subspace methods with a deterministic linear UCB bandit that selects subspaces aligned with the current gradient. It proves sublinear dynamic regret for the proposed Framework 2 algorithm and demonstrates practical implementations in both a forward-mode AD setting and a derivative-free setting, the latter through the SS-POUNDers variant. In the numerical experiments, the linear UCB augmentation consistently outperforms purely randomized subspace sketches, especially in high dimensions or on problems with a low-dimensional effective subspace. The work combines bandit-driven subspace selection with model-based updates, offering a scalable approach for problems where full-gradient information is expensive or unavailable.

Abstract

We consider the framework of methods for unconstrained minimization that are, in each iteration, restricted to a model that is only a valid approximation to the objective function on some affine subspace containing an incumbent point. These methods are of practical interest in computational settings where derivative information is either expensive or impossible to obtain. Recent attention has been paid in the literature to employing randomized matrix sketching for generating the affine subspaces within this framework. We consider a relatively straightforward, deterministic augmentation of such a generic subspace optimization method. In particular, we consider a sequential optimization framework where actions consist of one-dimensional linear subspaces and rewards consist of (approximations to) the magnitudes of directional derivatives computed in the direction of the action subspace. Reward maximization in this context is consistent with maximizing lower bounds on descent guaranteed by first-order Taylor models. This sequential optimization problem can be analyzed through the lens of dynamic regret. We modify an existing linear upper confidence bound (UCB) bandit method and prove sublinear dynamic regret in the subspace optimization setting. We demonstrate the efficacy of employing this linear UCB method in a setting where forward-mode algorithmic differentiation can provide directional derivatives in arbitrary directions and in a derivative-free setting. For the derivative-free setting, we propose SS-POUNDers, an extension of the derivative-free optimization method POUNDers that employs the linear UCB mechanism to identify promising subspaces. Our numerical experiments suggest a preference, in either computational setting, for employing a linear UCB mechanism within a subspace optimization method.
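The action and reward loop described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's Framework 2: it assumes a finite candidate set of unit directions and a fixed gradient (so it ignores the drift that dynamic regret accounts for), and the names `LinearUCBDirections`, `reg`, and `width` are hypothetical. Each observed directional derivative updates a ridge-regression estimate of the gradient, and the next direction maximizes estimated magnitude plus an uncertainty bonus.

```python
import numpy as np


class LinearUCBDirections:
    """Illustrative linear UCB over candidate directions (not the paper's algorithm)."""

    def __init__(self, dim, reg=1.0, width=1.0):
        self.V = reg * np.eye(dim)   # regularized Gram matrix of past actions
        self.b = np.zeros(dim)       # sum of action * observed directional derivative
        self.width = width           # exploration weight (assumed constant here)

    def estimate(self):
        # Ridge-regression estimate of the gradient from all observations so far.
        return np.linalg.solve(self.V, self.b)

    def select(self, candidates):
        # candidates: (m, dim) array of unit vectors; returns the index with the
        # largest upper confidence bound on |directional derivative|.
        g_hat = self.estimate()
        V_inv = np.linalg.inv(self.V)
        mean = np.abs(candidates @ g_hat)
        bonus = np.sqrt(np.einsum("ij,jk,ik->i", candidates, V_inv, candidates))
        return int(np.argmax(mean + self.width * bonus))

    def update(self, action, directional_derivative):
        # Record one observation: the derivative of f along `action`.
        self.V += np.outer(action, action)
        self.b += directional_derivative * action


g = np.array([0.5, -4.0, 1.0, 0.0])   # stand-in gradient, unknown to the bandit
basis = np.eye(4)                      # candidate one-dimensional subspaces
bandit = LinearUCBDirections(dim=4)
for a in basis:
    bandit.update(a, a @ g)            # observe one directional derivative per axis
best = bandit.select(basis)            # index of the direction the bandit prefers
```

In this toy run the bandit prefers the coordinate with the largest gradient magnitude. A method suited to the paper's setting would additionally have to cope with the gradient changing from one iterate to the next.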

Paper Structure

This paper contains 17 sections, 8 theorems, 60 equations, 6 figures.

Key Result

Lemma 1

Let Assumption \ref{ass:f} hold; in particular, let $L$ be a global Lipschitz constant for $\nabla f$. Let a sequence of full column rank matrices $\{\mathbf{S}_k\}$ be given. For all $k=0,1,\dots, K$, the sequence $\{\mathbf{x}_k\}$ generated by Algorithm \ref{alg:linesearch} satisfies where $\beta$ is the backtracking parameter from Algorithm \ref{alg:linesearch}.
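The line-search algorithm that Lemma 1 refers to is not reproduced here, so the following is only a generic sketch of one subspace backtracking step, written under stated assumptions: the search direction is $-\mathbf{S}\mathbf{S}^\top \nabla f(\mathbf{x})$, acceptance uses an Armijo sufficient-decrease test, and `alpha0`, `c`, and `max_backtracks` are hypothetical parameters. Only `beta` corresponds to the backtracking parameter named in the lemma.

```python
import numpy as np


def subspace_backtracking_step(f, grad, x, S, alpha0=1.0, beta=0.5, c=1e-4,
                               max_backtracks=50):
    """One Armijo backtracking step restricted to the column space of S."""
    # Reduced gradient: directional derivatives along the columns of S.
    # (The full gradient is used here only for simplicity of the sketch.)
    reduced_grad = S.T @ grad(x)
    direction = -S @ reduced_grad
    slope = -float(reduced_grad @ reduced_grad)   # grad(x)^T direction, never positive
    fx = f(x)
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * direction) <= fx + c * alpha * slope:
            return x + alpha * direction          # sufficient decrease achieved
        alpha *= beta                             # shrink the step and retry
    return x                                      # no acceptable step found


def quad(x):
    return 0.5 * float(x @ x)


def quad_grad(x):
    return x


x0 = np.array([1.0, 2.0, 3.0])
S0 = np.eye(3)[:, :2]                  # subspace spanned by the first two axes
x1 = subspace_backtracking_step(quad, quad_grad, x0, S0)
```

On this quadratic the full step is accepted immediately and the iterate moves only within the span of the first two coordinates.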

Figures (6)

  • Figure 1: Values of $r_{i,p}$ (see \ref{eq:ratio_def}) shown as box-whisker plots on 30 random replications of applying \ref{alg:practical} to CUTEst problems of dimension $11\leq d\leq 100$. Top: Results with $p_k = \lceil 0.1 d\rceil$. Bottom: Results with $p_k = \lceil 0.01 d\rceil$.
  • Figure 2: Values of $r_{i,p}$ (see \ref{eq:ratio_def}) shown as box-whisker plots on 30 random replications of applying \ref{alg:practical} to CUTEst problems of dimension $101\leq d\leq 1000$. Top: Results with $p_k = \lceil 0.01 d\rceil$. Bottom: Results with $p_k = \lceil 0.001 d\rceil$.
  • Figure 3: Values of $r_{i,p}$ (see \ref{eq:ratio_def}) shown as box-whisker plots on 30 random replications of applying \ref{alg:practical} to CUTEst problems of dimension $1001\leq d\leq 10000$. Top: Results with $p_k = \lceil 0.01 d\rceil$. Bottom: Results with $p_k = \lceil 0.001 d\rceil$.
  • Figure 4: Data profiles comparing variants of SS-POUNDers with POUNDers in the budget-constrained setting of one budget unit ($d+2$ function evaluations) on midscale YATSOp problems. Left: tolerance $\tau=0.1$. Center: $\tau=0.01$. Right: $\tau=0.001$.
  • Figure 5: Data profiles comparing variants of SS-POUNDers with POUNDers in the less budget-constrained setting of fifty budget units ($50(d+2)$ function evaluations) on midscale YATSOp problems. Left: tolerance $\tau=0.1$. Center: $\tau=0.01$. Right: $\tau=0.001$.
  • ...and 1 more figure
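For readers unfamiliar with the data profiles in Figures 4 and 5, the sketch below shows how such a curve is commonly computed. It assumes the usual convergence test $f(\mathbf{x}) \leq f_L + \tau\,(f(\mathbf{x}_0) - f_L)$, where $f_L$ is the best value any solver found on the problem, and it measures cost in budget units of $d+2$ function evaluations as in the captions. The function name `data_profile` and its arguments are hypothetical, and the paper's exact test may differ.

```python
import numpy as np


def data_profile(histories, dims, f_best, tau, budgets):
    """Fraction of problems solved to tolerance tau within each budget.

    histories[i] lists the objective value at every evaluation of one solver on
    problem i, starting with f(x0); f_best[i] is the best known value f_L.
    """
    units_needed = []
    for hist, d, f_low in zip(histories, dims, f_best):
        hist = np.asarray(hist, dtype=float)
        target = f_low + tau * (hist[0] - f_low)
        hits = np.nonzero(hist <= target)[0]
        # Evaluations used (1-based), in units of d + 2; inf if never solved.
        units_needed.append((hits[0] + 1) / (d + 2) if hits.size else np.inf)
    units_needed = np.array(units_needed)
    return np.array([np.mean(units_needed <= kappa) for kappa in budgets])


histories = [
    [10.0, 5.0, 1.5, 1.0],                        # reaches the target at evaluation 3
    [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0],    # never reaches the target
]
profile = data_profile(histories, dims=[2, 2], f_best=[1.0, 1.0],
                       tau=0.1, budgets=[0.5, 1.0])
```

Each entry of `profile` is the height of the data-profile curve at the corresponding budget, so plotting it against `budgets` for several solvers gives a comparison of the kind the captions describe.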

Theorems & Definitions (14)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • Lemma 4
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • ...and 4 more