Two-level trust-region method with random subspaces

Andrea Angino; Alena Kopaničáková; Rolf Krause

TL;DR

This work tackles the high computational cost of unconstrained nonlinear optimization by introducing TLTR, a two-level trust-region algorithm that fuses full-space and random subspace information. A composite search direction $p_k = p_k^F + a_k S_k^T p_k^S$ is formed, with $p_k^F$ from a full-space TR step and $p_k^S$ from a sketched subspace around $x_{k+1/2}=x_k+p_k^F$. Sketching via Gaussian and s-hashing generates the random subspaces, allowing cheap subspace solves while preserving convergence. Numerical tests on logistic and least-squares losses show TLTR outperforms classical TR and sketched Newton, with larger gains as $n$ and conditioning increase, indicating strong potential for large-scale ML optimization.
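The two sketching families named above can be illustrated with a short NumPy sketch. It follows the commonly used definitions (a Gaussian matrix with i.i.d. $N(0, 1/\ell)$ entries, and an $s$-hashing matrix with exactly $s$ nonzeros $\pm 1/\sqrt{s}$ per column); the scaling and the function names `gaussian_sketch` and `s_hashing_sketch` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def gaussian_sketch(ell, n, rng):
    """Dense ell-by-n sketch with i.i.d. N(0, 1/ell) entries (assumed scaling)."""
    return rng.standard_normal((ell, n)) / np.sqrt(ell)

def s_hashing_sketch(ell, n, s, rng):
    """Sparse ell-by-n sketch: each column gets exactly s nonzeros, each +-1/sqrt(s)."""
    if not 1 <= s <= ell:
        raise ValueError("need 1 <= s <= ell")
    S = np.zeros((ell, n))
    for j in range(n):
        # Pick s distinct rows for column j and assign random signs.
        rows = rng.choice(ell, size=s, replace=False)
        signs = rng.choice(np.array([-1.0, 1.0]), size=s)
        S[rows, j] = signs / np.sqrt(s)
    return S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, ell, s = 20, 5, 2  # hypothetical sizes, e.g. ell = ceil(n/4), s = ceil(ell/4)
    print(gaussian_sketch(ell, n, rng).shape)
    print(np.count_nonzero(s_hashing_sketch(ell, n, s, rng), axis=0))
```

With either matrix $S_k \in \mathbb{R}^{\ell \times n}$, a subspace direction $p_k^S \in \mathbb{R}^{\ell}$ is mapped back to the full space as $S_k^T p_k^S$. The $s$-hashing variant is sparse, so applying it is cheaper than applying a dense Gaussian matrix of the same size.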

Abstract

We introduce a two-level trust-region method (TLTR) for solving unconstrained nonlinear optimization problems. Our method uses a composite iteration step, which is based on two distinct search directions. The first search direction is obtained through minimization in the full/high-resolution space, ensuring global convergence to a critical point. The second search direction is obtained through minimization in a randomly generated subspace, which, in turn, allows for convergence acceleration. The efficiency of the proposed TLTR method is demonstrated through numerical experiments in the field of machine learning.
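A minimal sketch of how such a composite step could be assembled is given below, under several assumptions that are not taken from the paper: both subproblems are solved with a Cauchy-point step, the same radius `delta` is used on both levels, the weight `a` is a plain scalar, and the step acceptance test and radius update of a full trust-region method are omitted. The names `cauchy_point` and `composite_step` are hypothetical.

```python
import numpy as np

def cauchy_point(g, H, delta):
    """Minimizer of the quadratic model along -g within the ball of radius delta."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gHg = g @ H @ g
    # Nonpositive curvature along -g: go to the trust-region boundary.
    tau = 1.0 if gHg <= 0.0 else min(1.0, gnorm**3 / (delta * gHg))
    return -(tau * delta / gnorm) * g

def composite_step(x, grad, hess, S, delta, a=1.0):
    """Return (p, p_full) with p = p_full + a * S^T p_sub (illustrative only)."""
    # Full-space direction from the model built at x.
    p_full = cauchy_point(grad(x), hess(x), delta)
    x_half = x + p_full
    # Subspace model built at the intermediate point x_half.
    g_sub = S @ grad(x_half)
    H_sub = S @ hess(x_half) @ S.T
    p_sub = cauchy_point(g_sub, H_sub, delta)
    return p_full + a * (S.T @ p_sub), p_full

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, ell = 8, 3
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)          # SPD Hessian of a toy quadratic
    b = rng.standard_normal(n)
    f = lambda x: 0.5 * x @ A @ x - b @ x
    grad = lambda x: A @ x - b
    hess = lambda x: A
    S = rng.standard_normal((ell, n)) / np.sqrt(ell)
    x = np.ones(n)
    p, p_full = composite_step(x, grad, hess, S, delta=1.0)
    print(f(x), f(x + p_full), f(x + p))
```

On a convex quadratic both models are exact, so the full-space step lowers the objective and the subspace correction cannot raise it. Note that in this simplified form the composite step may be longer than `delta`; how the method itself controls the step length is not specified here.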

Paper Structure (5 sections, 5 equations, 3 figures, 1 table, 1 algorithm)

Introduction
Two-level TR (TLTR) with random subspaces
The TLTR algorithm
The computational cost of the TLTR method
Numerical examples

Figures (3)

Figure 1: Convergence history of TR and TLTR for solving the unconstrained minimization problem with $f_{LL}$ (first row) and $f_{LS}$ (second row) with Gaussian (dashed lines, Australian dataset) and $s$-hashing (solid lines, Mushroom dataset, $s = \left\lceil \ell/4\right\rceil$) subspaces of varying size $\ell = \lceil n P\rceil$, where $P$ denotes the portion of the full-space parameters. The QP problems on the full space are solved using 2 iterations of ST-CG or the CP method.
Figure 2: Left: Convergence of the TLTR method with the $s$-hashing strategy ($\ell=\left\lceil n/5 \right\rceil$) for different values of the sketching parameter $s$ for the logistic loss with the Mushroom dataset. Right: Convergence history of TR and TLTR for the least-squares loss minimization problem with the Gisette dataset.
Figure 3: Convergence history of TR, SN and TLTR with Gaussian (dashed lines, Australian dataset) and $s$-hashing (solid lines, Gisette/Mushroom datasets) subspaces. The subspace sizes are chosen as a portion of the full-space dimension $n$, stated in brackets. The QP problems on the full space are solved using CP/ST-CG (2/5 its) methods (Top/Bottom row).
