A randomized algorithm to solve reduced rank operator regression
Giacomo Turri, Vladimir Kostic, Pietro Novelli, Massimiliano Pontil
TL;DR
This work tackles vector-valued regression between reproducing kernel Hilbert spaces (RKHSs) by learning low-rank linear operators via regularized empirical risk minimization under a rank constraint. It introduces Randomized Reduced Rank Regression (R$^4$), a Gaussian sketching framework that yields efficient primal and dual algorithms for computing rank-$r$ estimators, with explicit bounds, in expectation over the sketch, on the regularized empirical risk; these bounds tighten as the sketch size and the number of power iterations grow. Theoretical results show that the randomized estimators approach the optimal regularized empirical risk when hyperparameters are properly tuned, while numerical experiments on synthetic data, large-scale neuroscience data, and Koopman-operator regression demonstrate substantial speedups (including GPU-enabled gains at large scale) without sacrificing accuracy. The methods offer a tractable route to scalable operator learning in infinite-dimensional settings and could be combined with Nyström or random-feature approximations to further reduce complexity and extend to broader vector-valued regression tasks. Overall, R$^4$ provides a principled, fast alternative to eigenvalue-based solvers for reduced-rank operator regression, backed by theoretical guarantees and evaluated on large-scale problems.
Abstract
We present and analyze an algorithm for vector-valued regression problems involving possibly infinite-dimensional input and output spaces. The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function (i.e., an operator) between sampled data via regularized empirical risk minimization with rank constraints. We propose Gaussian sketching techniques for both the primal and dual optimization objectives, yielding Randomized Reduced Rank Regression (R$^4$) estimators that are efficient and accurate. For each of our R$^4$ algorithms we prove that the resulting regularized empirical risk is, in expectation with respect to the randomness of the sketch, arbitrarily close to the optimal value when hyperparameters are properly tuned. Numerical experiments illustrate the tightness of our bounds and show advantages in two distinct scenarios: (i) solving a vector-valued regression problem using synthetic and large-scale neuroscience datasets, and (ii) regressing the Koopman operator of a nonlinear stochastic dynamical system.
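
To make the idea of sketching a reduced rank regression problem concrete, the following is a minimal finite-dimensional sketch in NumPy. It is not the authors' R$^4$ algorithm: it assumes plain matrices in place of RKHS operators, an unnormalized ridge objective $\|Y - XW\|_F^2 + \gamma \|W\|_F^2$, and a standard Gaussian randomized range finder with power iterations swapped in for the eigenvalue solver. All function names and parameters (`ridge`, `exact_rrr`, `sketched_rrr`, `oversampling`, `power_iters`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def ridge(X, Y, gamma):
    """Full-rank ridge solution W = (X^T X + gamma I)^{-1} X^T Y."""
    d_in = X.shape[1]
    return np.linalg.solve(X.T @ X + gamma * np.eye(d_in), X.T @ Y)


def objective(X, Y, W, gamma):
    """Unnormalized regularized risk ||Y - XW||_F^2 + gamma ||W||_F^2."""
    return np.linalg.norm(Y - X @ W) ** 2 + gamma * np.linalg.norm(W) ** 2


def exact_rrr(X, Y, gamma, r):
    """Rank-r minimizer via a full eigendecomposition (baseline)."""
    W = ridge(X, Y, gamma)
    # M = Y^T X (X^T X + gamma I)^{-1} X^T Y is symmetric PSD.
    M = Y.T @ X @ W
    _, evecs = np.linalg.eigh((M + M.T) / 2)  # eigenvalues ascending
    V = evecs[:, -r:]                         # top-r eigenvectors
    return W @ V @ V.T


def sketched_rrr(X, Y, gamma, r, oversampling=5, power_iters=1, seed=0):
    """Rank-r estimator using a Gaussian sketch of M (illustrative only)."""
    rng = np.random.default_rng(seed)
    W = ridge(X, Y, gamma)

    def apply_M(Z):
        # Apply M to Z without forming M explicitly.
        return Y.T @ (X @ (W @ Z))

    # Gaussian test matrix with r + oversampling columns.
    Omega = rng.standard_normal((Y.shape[1], r + oversampling))
    Q, _ = np.linalg.qr(apply_M(Omega))
    for _ in range(power_iters):
        # Power iterations sharpen the captured subspace.
        Q, _ = np.linalg.qr(apply_M(Q))

    # Solve the small projected eigenproblem and lift back.
    B = Q.T @ apply_M(Q)
    _, U = np.linalg.eigh((B + B.T) / 2)
    V = Q @ U[:, -r:]
    return W @ V @ V.T


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 20))
    W_true = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
    Y = X @ W_true + 0.1 * rng.standard_normal((200, 15))
    for name, fit in [("exact", exact_rrr), ("sketched", sketched_rrr)]:
        print(name, objective(X, Y, fit(X, Y, 1e-2, 3), 1e-2))
```

In this toy setting the sketched estimator has rank at most $r$, so its objective can never fall below that of the exact rank-$r$ minimizer; increasing `oversampling` or `power_iters` is expected to narrow the gap, which mirrors the qualitative behavior described for the sketch size and power iterations.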
