
Stratified adaptive sampling for derivative-free stochastic trust-region optimization

Giovanni Amici, Sara Shashaani, Pranav Jain

Abstract

There is emerging evidence that trust-region (TR) algorithms are very effective at solving derivative-free nonconvex stochastic optimization problems in which the objective function is a Monte Carlo (MC) estimate. A recent strand of methodologies adaptively adjusts the sample size of the MC estimates by keeping the estimation error below a measure of stationarity induced from the TR radius. In this work we explore stratified adaptive sampling strategies to equip the TR framework with accurate estimates of the objective function, thus optimizing the required number of MC samples to reach a given ε-accuracy of the solution. We prove a reduced sample complexity, confirm a superior efficiency via numerical tests and applications, and explore inexpensive implementations in high dimension.

Paper Structure

This paper contains 15 sections, 9 theorems, 55 equations, 3 figures, 2 tables, 3 algorithms.

Key Result

Theorem 15

Let Assumptions ass:smooth, ass:map, and ass:strata hold. Then, given the MC sample size $n$ and the total number of strata $\ell=n/\overline{n}$, for $\overline{n}\in\mathbb{N}$, the conditional variance of the stratified sampling estimator has order of magnitude [...], where $a_{\max}=\max_{i}\{a_i\}$, $b_{\max}=\max_{i}\{b_i\}$, $\xi_{\max}=\max_{i}\{\xi_i\}$, $\kappa_{\min}=\min_{i}\{\kappa_i\}$ [...]
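As a rough illustration of the setting in the theorem, the sketch below compares a plain MC estimator with a stratified estimator that uses $\ell=n/\overline{n}$ strata and $\overline{n}$ draws per stratum. It is not the paper's estimator or algorithm: the one-dimensional uniform input, the equal-width strata on $[0,1)$, the integrand `math.exp`, and all function names are assumptions made only for this example.

```python
import math
import random
import statistics


def plain_mc(f, n, rng):
    """Plain MC estimate of E[f(U)], U ~ Uniform(0, 1), from n draws."""
    return sum(f(rng.random()) for _ in range(n)) / n


def stratified_mc(f, n, n_bar, rng):
    """Stratified estimate with ell = n / n_bar equal-width strata on [0, 1).

    Assumption for this sketch: every stratum has probability 1 / ell and
    receives n_bar draws, so the estimator is the average of all n points.
    """
    if n % n_bar != 0:
        raise ValueError("n must be a multiple of n_bar")
    ell = n // n_bar
    total = 0.0
    for i in range(ell):
        for _ in range(n_bar):
            # Uniform draw inside stratum [i / ell, (i + 1) / ell).
            total += f((i + rng.random()) / ell)
    return total / n


def empirical_variance(estimator, reps, seed):
    """Sample variance of an estimator over independent replications."""
    rng = random.Random(seed)
    return statistics.variance([estimator(rng) for _ in range(reps)])


f = math.exp            # hypothetical smooth integrand, E[f(U)] = e - 1
n, n_bar, reps = 64, 2, 2000

var_plain = empirical_variance(lambda rng: plain_mc(f, n, rng), reps, seed=0)
var_strat = empirical_variance(lambda rng: stratified_mc(f, n, n_bar, rng), reps, seed=1)
print(f"plain MC variance:      {var_plain:.3e}")
print(f"stratified MC variance: {var_strat:.3e}")
```

For a smooth integrand, the within-stratum variance shrinks as the strata narrow, so the stratified estimator's variance should fall well below that of plain MC at the same budget $n$. The precise rate under the paper's assumptions is the one stated in the theorem, not anything inferred from this toy example.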

Figures (3)

  • Figure 1: Mean objective function value, with 80% confidence band, as a function of the MC sample size (denoted as Budget). Optimization problems: \ref{eq:ex1}, \ref{eq:ex2}, \ref{eq:ex3} (top-left, top-right, and bottom panel, respectively). Algorithms: SASTRODF-2, ASTRODF-C, and TRODF, implemented via Algorithms \ref{alg:sastro-df}-\ref{alg:aux}. SASTRODF-2 is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with $\overline{n}=2$ and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma}. ASTRODF-C is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with one fixed stratum and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma_NS} (consistent with Chebyshev-like bounds, see Section \ref{sec:comparison}). TRODF is computed with one fixed stratum and a fixed sample size over all iterations.
  • Figure 2: Objective function value as a function of the MC sample size (denoted as Budget). Optimization problem: \ref{eq:portfolio_general}. Two initial guesses (left and right plot, respectively). Algorithms: SASTRODF-2, ASTRODF-C, and TRODF, implemented via Algorithms \ref{alg:sastro-df}-\ref{alg:aux}. SASTRODF-2 is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with $\overline{n}=2$ and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma}. ASTRODF-C is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with one fixed stratum and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma_NS} (consistent with Chebyshev-like bounds, see Section \ref{sec:comparison}). TRODF is computed with one fixed stratum and a fixed sample size over all iterations.
  • Figure 3: Average of problems solved by each algorithm as a function of the fraction of budget used. Optimization problems: \ref{eq:ex1}, \ref{eq:ex2}, \ref{eq:ex3}, \ref{eq:portfolio_general}. Algorithms: SASTRODF-2, SASTRODF-3, ASTRODF-C, ASTRODF-B, and TRODF, implemented via Algorithms \ref{alg:sastro-df}-\ref{alg:aux}. SASTRODF-2 (SASTRODF-3) is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with $\overline{n}=2$ ($\overline{n}=3$) and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma}. ASTRODF-C is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with one fixed stratum and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma_NS} (consistent with Chebyshev-like bounds, see Section \ref{sec:comparison}). ASTRODF-B is computed with Algorithms \ref{alg:sastro-df}-\ref{alg:aux}, with one fixed stratum and $\lambda_k,\gamma$ set as in \ref{eq:lambda_gamma_NS_Bernstein} (consistent with Bernstein-like bounds, see Section \ref{sec:comparison}). TRODF is computed with one fixed stratum and a fixed sample size over all iterations.

Theorems & Definitions (26)

  • Definition 1
  • Definition 2
  • Remark 6
  • Definition 7
  • Remark 8
  • Remark 9
  • Definition 10
  • Definition 11
  • Definition 13
  • Definition 14
  • ...and 16 more