Faster Randomized Methods for Orthogonality Constrained Problems
Boris Shustin, Haim Avron
TL;DR
The paper addresses optimization problems with generalized orthogonality constraints by integrating randomized preconditioning into the Riemannian optimization framework. The authors design constant SPD metrics ${\bf M}$, constructed from data sketches that approximate Gram matrices, which yield well-conditioned Riemannian Hessians and faster convergence for problems such as canonical correlation analysis (CCA) and Fisher linear discriminant analysis (FDA). They provide theoretical guarantees that bound the Hessian condition number in terms of sketch quality, and they demonstrate substantial empirical speedups on real and synthetic datasets, including benefits from warm starts. The approach reduces preprocessing costs to near-linear in the data size while preserving convergence behavior across zeroth-, first-, and second-order Riemannian methods. Overall, the work offers a practical, theory-backed pathway to scalable optimization under orthogonality constraints, with broad applicability in data analysis tasks.
Abstract
Recent literature has advocated the use of randomized methods for accelerating the solution of various matrix problems arising throughout data science and computational science. One popular strategy for leveraging randomization is to use it as a way to reduce problem size. However, methods based on this strategy lack sufficient accuracy for some applications. Randomized preconditioning is another approach for leveraging randomization, which provides higher accuracy. The main challenge in using randomized preconditioning is the need for an underlying iterative method; as a result, randomized preconditioning has so far been applied almost exclusively to solving regression problems and linear systems. In this article, we show how to expand the application of randomized preconditioning to another important set of problems prevalent across data science: optimization problems with (generalized) orthogonality constraints. We demonstrate our approach, which is based on the framework of Riemannian optimization and Riemannian preconditioning, on the problem of computing the dominant canonical correlations and on the Fisher linear discriminant analysis problem. For both problems, we evaluate the effect of preconditioning on the computational costs and asymptotic convergence, and demonstrate empirically the utility of our approach.
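The idea of building a preconditioner from a sketch can be illustrated with a small numerical experiment. The following is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian sketching matrix, the synthetic data, the problem sizes, and the helper names `sketched_gram` and `preconditioned_condition_number` are all hypothetical choices made for illustration. It compares the condition number of a Gram matrix ${\bf X}^T{\bf X}$ with that of the same matrix after preconditioning by a sketched approximation ${\bf M} = ({\bf S}{\bf X})^T({\bf S}{\bf X})$.

```python
import numpy as np


def sketched_gram(X, sketch_rows, rng):
    """Approximate X^T X by (S X)^T (S X).

    S is a dense Gaussian sketch (an assumption made for simplicity;
    faster sketching transforms exist and may be preferable in practice).
    """
    n = X.shape[0]
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    SX = S @ X
    return SX.T @ SX


def preconditioned_condition_number(G, M):
    """Condition number of L^{-1} G L^{-T}, where M = L L^T.

    This matrix has the same eigenvalues as M^{-1/2} G M^{-1/2}.
    """
    L = np.linalg.cholesky(M)
    Y = np.linalg.solve(L, G)          # Y = L^{-1} G
    A = np.linalg.solve(L, Y.T).T      # A = L^{-1} G L^{-T}
    A = 0.5 * (A + A.T)                # symmetrize against round-off
    eigs = np.linalg.eigvalsh(A)       # eigenvalues in ascending order
    return eigs[-1] / eigs[0]


rng = np.random.default_rng(0)
n, d, sketch_rows = 2000, 10, 200      # hypothetical sizes

# Synthetic tall matrix with badly scaled columns (ill-conditioned Gram matrix).
X = rng.standard_normal((n, d)) * np.logspace(0, 3, d)

G = X.T @ X                            # exact Gram matrix
M = sketched_gram(X, sketch_rows, rng) # sketched SPD approximation

kappa_raw = np.linalg.cond(G)
kappa_pre = preconditioned_condition_number(G, M)

print(f"condition number without preconditioning: {kappa_raw:.3e}")
print(f"condition number with sketched M:         {kappa_pre:.3e}")
```

In this toy setting the preconditioned matrix is expected to be far better conditioned than the original Gram matrix, which is the property that makes an iterative method converge in fewer iterations. How this carries over to Riemannian Hessians for CCA and FDA is the subject of the paper's analysis and is not reproduced here.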
