Scalable Second-Order Optimization Algorithms for Minimizing Low-rank Functions
Edward Tansley, Coralia Cartis
TL;DR
This work addresses the challenge of applying second-order optimization to high-dimensional problems by exploiting low-rank structure through random-subspace cubic regularization. It introduces R-ARC-D, a variant of R-ARC that adapts the sketch size $l_k$ (the subspace dimension) to the observed rank of the sketched Hessian, achieving the optimal $O(\epsilon^{-3/2})$ global rate of convergence while keeping $l_k$ on the order of the true rank $r$. Theoretical guarantees show that the adaptive rule preserves this rate under Gaussian embeddings, and numerical experiments on augmented low-rank CUTEst problems demonstrate substantial efficiency gains and the ability to learn the rank. The findings make scalable second-order methods more practical for high-dimensional, low-rank objectives, with applications in machine learning and hyperparameter optimization.
Abstract
We present a random-subspace variant of the cubic regularization algorithm that chooses the size of the subspace adaptively, based on the rank of the projected second-derivative matrix. At each iteration, our variant requires access only to (low-dimensional) projections of the first- and second-order derivatives of the problem and computes a reduced step inexpensively. The resulting method maintains the optimal global rate of convergence of (full-dimensional) cubic regularization, while showing improved scalability both theoretically and numerically, particularly when applied to low-rank functions. For such functions, our algorithm naturally adapts the subspace size to the true rank of the function, without knowing it a priori.
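To make the iteration described above concrete, the following is a minimal NumPy sketch of a random-subspace cubic regularization loop with a rank-based sketch-size update. It is an illustration under stated assumptions, not the authors' R-ARC-D implementation: the function names `solve_cubic_model` and `r_arc_adaptive`, the rule "increase the sketch size by one whenever the projected Hessian has full rank", the regularization update constants, and the use of the reduced-space norm in the cubic term are all choices made here for brevity. The full Hessian is also formed explicitly for simplicity, whereas the method itself only needs the projections.

```python
import numpy as np


def solve_cubic_model(g, H, sigma, iters=100):
    """Approximately minimise g@s + 0.5*s@H@s + (sigma/3)*||s||^3 for small dense H.

    Uses the optimality conditions (H + lam*I) s = -g with lam = sigma*||s||
    and bisects on lam. The degenerate "hard case" is not treated specially.
    """
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)
    w, Q = np.linalg.eigh(H)               # eigenvalues in ascending order
    g_tilde = Q.T @ g
    lo = max(0.0, -w[0]) + 1e-12           # keeps H + lam*I positive definite
    hi = lo + np.sqrt(sigma * g_norm)      # here sigma*||s(hi)|| <= hi
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        s_norm = np.linalg.norm(g_tilde / (w + lam))
        if sigma * s_norm > lam:
            lo = lam
        else:
            hi = lam
    return -Q @ (g_tilde / (w + hi))


def r_arc_adaptive(f, grad, hess, x0, l0=1, sigma=1.0, eta=0.1,
                   tol=1e-6, rank_rtol=1e-8, max_iter=500, seed=0):
    """Random-subspace cubic regularization with an illustrative rank rule."""
    rng = np.random.default_rng(seed)
    x, l, d = np.array(x0, dtype=float), l0, len(x0)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        S = rng.standard_normal((l, d)) / np.sqrt(l)   # scaled Gaussian sketch
        g_s = S @ g                                    # projected gradient
        H_s = S @ hess(x) @ S.T                        # projected Hessian (l x l)
        s_hat = solve_cubic_model(g_s, H_s, sigma)     # reduced step
        # Assumed rule (not taken from the paper): a full-rank projected
        # Hessian suggests the sketch may be too small, so grow it by one.
        rank = np.linalg.matrix_rank(H_s, tol=rank_rtol * np.linalg.norm(H_s, 2))
        if rank == l and l < d:
            l += 1
        predicted = -(g_s @ s_hat + 0.5 * s_hat @ H_s @ s_hat)
        if predicted <= 0.0:
            continue                                   # resample the sketch
        s = S.T @ s_hat                                # lift step to full space
        rho = (f(x) - f(x + s)) / predicted
        if rho >= eta:                                 # successful step
            x, sigma = x + s, max(0.5 * sigma, 1e-4)
        else:                                          # unsuccessful step
            sigma *= 2.0
    return x, l


# Low-rank test function: f depends on x only through A @ x, with rank(A) = r.
rng = np.random.default_rng(1)
d, r = 20, 3
A = rng.standard_normal((r, d))
b = rng.standard_normal(r)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)
hess = lambda x: A.T @ A
x_final, l_final = r_arc_adaptive(f, grad, hess, np.zeros(d))
print("final sketch size:", l_final, "gradient norm:", np.linalg.norm(grad(x_final)))
```

In this toy problem the Hessian has rank $r$ everywhere, so under the assumed rule the sketch size grows from 1 until the projected Hessian first becomes rank-deficient and then stops changing, without $r$ being supplied to the solver.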
