Trust-Region Methods with Low-Fidelity Objective Models

Andrea Angino, Matteo Aurina, Alena Kopaničáková, Matthias Voigt, Marco Donatelli, Rolf Krause

TL;DR

Large-scale binary classification is cast as the unconstrained minimization of $f(w)=\frac{1}{q}\sum_{i=1}^q \ell(w;x_i,y_i)$ and solved with two multifidelity trust-region methods built on the Magical Trust Region framework. STR and SVDTR augment the high-fidelity step with a low-fidelity direction obtained from a data projection $S_k\in\mathbb{R}^{t\times n}$, which is a random sketch in STR and a fixed SVD-based projector in SVDTR; the composite step is $p_k = p_k^{\mathrm{H}} + \alpha_k S_k^{\top} p_k^{\mathrm{L}}$. Empirical results on LIBSVM datasets show consistent reductions in outer-iteration counts relative to the plain trust-region method, particularly as the reduced dimension $t$ increases and when the high-fidelity subproblem solver is more expensive. The work demonstrates that data-driven coarse spaces can meaningfully accelerate trust-region optimization for high-dimensional, finite-sum objectives. Accompanying code is available.

Abstract

We introduce two multifidelity trust-region methods based on the Magical Trust Region (MTR) framework. MTR augments the classical trust-region step with a secondary, informative direction. In our approaches, the secondary "magical" directions are determined by solving coarse trust-region subproblems based on low-fidelity objective models. The first proposed method, Sketched Trust-Region (STR), constructs this secondary direction using a sketched matrix to reduce the dimensionality of the trust-region subproblem. The second method, SVD Trust-Region (SVDTR), defines the magical direction via a truncated singular value decomposition of the dataset, capturing the leading directions of variability. Several numerical examples illustrate the potential gain in efficiency.
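
The following is a minimal NumPy sketch of how a composite step of the form $p_k = p_k^{\mathrm{H}} + \alpha_k S_k^{\top} p_k^{\mathrm{L}}$ could be assembled. It is not the authors' implementation. The function names, the Gaussian scaling of the sketch, the Galerkin-type coarse model ($S g$, $S H S^{\top}$), the use of the Cauchy point for both subproblems, and the fixed `alpha` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_sketch(t, n, rng):
    """Random t x n sketch with i.i.d. N(0, 1/t) entries (assumed choice for STR)."""
    return rng.standard_normal((t, n)) / np.sqrt(t)


def svd_projector(X, t):
    """Fixed t x n projector: the t leading right singular vectors of the
    q x n data matrix X (assumed choice for SVDTR)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:t]


def cauchy_point(g, H, delta):
    """Cauchy point of m(p) = g @ p + 0.5 * p @ H @ p subject to ||p|| <= delta."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    curv = g @ H @ g
    # Non-positive curvature along -g: step to the trust-region boundary.
    tau = 1.0 if curv <= 0 else min(1.0, gnorm**3 / (delta * curv))
    return -(tau * delta / gnorm) * g


def composite_step(g, H, S, delta, alpha=1.0):
    """Hypothetical composite step p = p_H + alpha * S.T @ p_L."""
    p_high = cauchy_point(g, H, delta)   # high-fidelity step in R^n
    g_low = S @ g                        # assumed coarse gradient in R^t
    H_low = S @ H @ S.T                  # assumed coarse Hessian in R^{t x t}
    p_low = cauchy_point(g_low, H_low, delta)
    return p_high + alpha * (S.T @ p_low)
```

With `alpha=0.0` the composite step reduces to the high-fidelity step. For other values the result may leave the trust region, so a complete method still needs a rule for choosing $\alpha_k$ and for accepting or rejecting the step; that logic is not sketched here.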

Paper Structure

This paper contains 3 sections, 9 equations, 3 figures, 1 algorithm.

Figures (3)

  • Figure 1.1: Convergence histories of TR (solid black), STR (solid), and SVDTR (dashed) for solving \ref{eq:Problem}. Top: Australian with $f_{\mathrm{LS}}$ using CP (left) and ST-CG (right). Bottom: Mushroom with $f_{\mathrm{LL}}$ under the same full-space solvers. Legend entries for STR/SVDTR indicate the reduced dimension $t$ as a percentage of the feature dimension $n$.
  • Figure 1.2: Convergence histories of TR (solid black), STR (solid), and SVDTR (dashed) for solving \ref{eq:Problem} with $f_{\mathrm{LS}}$, all using CP. Left: $\lVert \nabla f_{\mathrm{LS}} \rVert_{2}$ versus iteration count; right: $\lVert \nabla f_{\mathrm{LS}} \rVert_{2}$ versus wall-clock time (s). Legend entries for STR/SVDTR indicate the reduced dimension $t$ as a percentage of $n$.
  • Figure 1.3: Convergence histories on the Gisette dataset of TR (solid black), STR (solid), and SVDTR (dashed) for solving \ref{eq:Problem} with $f_{\mathrm{LL}}$. Top: ST-CG full-space solver. Bottom: CP full-space solver. Legend entries for STR/SVDTR indicate the reduced dimension $t$ as a percentage of $n$.