
Learning the subspace of variation for global optimization of functions with low effective dimension

Coralia Cartis, Xinzhu Liang, Estelle Massart, Adilet Otemissov

TL;DR

The proposed framework replaces the original high-dimensional problem with one or several lower-dimensional reduced subproblems that capture the main directions of variation of the objective, estimated as the principal components of a collection of sampled gradients.

Abstract

We propose an algorithmic framework that employs active subspace techniques for scalable global optimization of functions with low effective dimension (also referred to as low-rank functions). The framework replaces the original high-dimensional problem with one or several lower-dimensional reduced subproblems that capture the main directions of variation of the objective, estimated here as the principal components of a collection of sampled gradients. We quantify the sampling complexity of estimating the subspace of variation of the objective in terms of its effective dimension, and hence bound the probability that the reduced problem provides a solution to the original problem. To account for the practical case in which the effective dimension is not known a priori, our framework adaptively solves a succession of reduced problems, increasing the number of sampled gradients until the estimated subspace of variation remains unchanged. We prove global convergence under mild assumptions on the objective, the sampling distribution and the subproblem solver, and illustrate numerically the benefits of our proposed algorithms over those using random embeddings.
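The subspace-estimation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`estimate_subspace`, `adaptive_subspace`), the Gaussian sampling distribution, the relative rank tolerance `tol`, the test objective, and the one-gradient-at-a-time stopping rule are all assumptions made for illustration, and the reduced-subproblem solves that the framework performs between sampling rounds are omitted.

```python
import numpy as np


def estimate_subspace(gradients, tol=1e-8):
    """Orthonormal basis for the span of the gradient columns.

    Uses an SVD of the (dim, num_samples) gradient matrix, i.e. uncentred
    principal components; `tol` is a hypothetical relative rank threshold.
    """
    u, s, _ = np.linalg.svd(gradients, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        return u[:, :0]
    rank = int(np.sum(s > tol * s[0]))
    return u[:, :rank]


def adaptive_subspace(grad, dim, rng):
    """Sample gradients one at a time at Gaussian points (an assumption)
    until the dimension of the estimated subspace stops growing."""
    gradients = np.empty((dim, 0))
    basis = np.empty((dim, 0))
    while gradients.shape[1] < dim:
        g = grad(rng.standard_normal(dim))
        gradients = np.column_stack([gradients, g])
        new_basis = estimate_subspace(gradients)
        if new_basis.shape[1] == basis.shape[1]:
            return new_basis  # estimate unchanged by the extra sample
        basis = new_basis
    return basis


# Illustrative objective f(x) = g(A^T x) with D = 20 and effective dimension 3,
# where g(y) = sum(sin(y)) + 0.5 * ||y||^2, so grad f(x) = A (cos(y) + y).
rng = np.random.default_rng(0)
D, d_e = 20, 3
A, _ = np.linalg.qr(rng.standard_normal((D, d_e)))  # orthonormal columns


def grad_f(x):
    y = A.T @ x
    return A @ (np.cos(y) + y)


basis = adaptive_subspace(grad_f, D, rng)
print(basis.shape)
```

With an estimated basis `U` in hand, a reduced problem in the sense of the abstract would minimise `y -> f(U @ y)` over the lower-dimensional variable `y`; the choice of global solver for that subproblem is left open here.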

Paper Structure

This paper contains 22 sections, 12 theorems, 60 equations, 9 figures, 4 tables, 6 algorithms.

Key Result

Lemma 2.6

Let Assumptions \ref{ass:f_low_base} and \ref{ass:C1} hold. Then, $\nabla f(\boldsymbol x) \in \mathcal{T}$ for all $\boldsymbol x \in \mathbb{R}^D$.
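A short derivation may help show why a statement of this form is plausible. The sketch below assumes that the label ass:f_low_base refers to the standard low-effective-dimension property (the objective is constant along the orthogonal complement $\mathcal{T}^\perp$ of a subspace $\mathcal{T}$) and that ass:C1 refers to continuous differentiability of $f$. Both readings are assumptions, since the labels are unresolved in this summary, and the argument is not claimed to be the paper's own proof.

```latex
% Assumed property: f(x_T + x_perp) = f(x_T) for all x_T in T, x_perp in T^perp.
% Fix x in R^D and v in T^perp, and write x = x_T + x_perp. For every real t,
\begin{align*}
  f(\boldsymbol x + t \boldsymbol v)
    &= f\bigl(\boldsymbol x_{\mathcal{T}} + (\boldsymbol x_{\perp} + t \boldsymbol v)\bigr)
     = f(\boldsymbol x_{\mathcal{T}})
     = f(\boldsymbol x), \\
  \nabla f(\boldsymbol x)^{\top} \boldsymbol v
    &= \frac{\mathrm{d}}{\mathrm{d}t}\, f(\boldsymbol x + t \boldsymbol v)\Big|_{t=0}
     = 0 .
\end{align*}
% Since v in T^perp was arbitrary, grad f(x) lies in (T^perp)^perp = T.
```

This is also why the span of sampled gradients is a natural estimator of $\mathcal{T}$: under these assumptions every sampled gradient lies in $\mathcal{T}$.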

Figures (9)

  • Figure 1: Example of a function with low effective dimension [cartisOtemissov2022].
  • Figure 2: Illustration of the function $\bar{f}$ defined in \ref{eq:polynomial_example}.
  • Figure 3: Comparing ASM-1, A-ASM, REGO-1, A-REGO and no-embedding (with mKNITRO) for functions with low effective dimensionality, in terms of function evaluation counts. Lines with the same colour represent three different realisations of an algorithm.
  • Figure 4: Comparing ASM-1, A-ASM, REGO-1, A-REGO and no-embedding (with mKNITRO) for functions with low effective dimensionality. The performance metric here is CPU time in seconds. Lines with the same colour represent three different realisations of an algorithm.
  • Figure 5: Computational costs of ASM-1, A-ASM, REGO-1 and A-REGO for the Rosenbrock function with $d_e \in \{10,20,50\}$ and $D = 100$ using three random seeds.
  • ...and 4 more figures

Theorems & Definitions (34)

  • Definition 1.1
  • Definition 2.1: Definition 1 in Wang et al. [Wang2016]
  • Example 2.2: Extracted from [cartisOtemissov2022]
  • Definition 2.4
  • Lemma 2.6
  • proof
  • Lemma 2.7
  • proof
  • Lemma 2.8
  • proof
  • ...and 24 more