Concentration inequalities for semidefinite least squares based on data
Filippo Fabiani, Andrea Simonetto
TL;DR
The paper tackles data-driven least-squares problems with semidefinite constraints by deriving a distribution-free, finite-sample certificate for the spectrum of the relaxed solution: with probability at least $1-\delta$, $\Lambda(F(x^*_N)) \in [m-\varepsilon, L+\varepsilon]$, where $\varepsilon = \frac{4B}{\rho \sqrt{N}} \sqrt{\lambda_{\max}(H) \ln(\ell/\delta)}$, so the bound tightens at rate $1/\sqrt{N}$ as the number of samples $N$ grows. This enables solving a simpler surrogate program in place of the full SDLS while guaranteeing spectral proximity to the constrained problem. The framework is illustrated via examples (e.g., PSD projection, Procrustes, kernel ridge with spectral constraints) and applied to learning an unknown quadratic, where gradient-descent iterates on the surrogate converge to within $O(\varepsilon)$ of the true minimizer, with bounds on the learned Hessian and linear term. Numerical experiments corroborate the theory and demonstrate the computational efficiency of the relaxed approach, suggesting broad practical impact for data-driven optimization under spectral constraints.
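As a rough illustration of how the certificate scales, the sketch below evaluates the expression for $\varepsilon$ stated above and inverts it to estimate how many samples a target margin would require. The summary does not define $B$, $\rho$, $H$, or $\ell$, so they are treated here as given positive constants; the function names and the numeric values are hypothetical and not taken from the paper.

```python
import math


def spectral_margin(B, rho, lam_max_H, ell, delta, N):
    """Evaluate eps = 4B / (rho * sqrt(N)) * sqrt(lam_max(H) * ln(ell / delta)).

    All arguments are assumed to be known constants; their precise
    definitions are not given in the summary above.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if N <= 0 or rho <= 0 or B < 0 or lam_max_H < 0 or ell < 1:
        raise ValueError("expected N > 0, rho > 0, B >= 0, lam_max_H >= 0, ell >= 1")
    return 4.0 * B / (rho * math.sqrt(N)) * math.sqrt(lam_max_H * math.log(ell / delta))


def samples_needed(B, rho, lam_max_H, ell, delta, eps_target):
    """Smallest integer N with spectral_margin(...) <= eps_target.

    Obtained by solving the same expression for N:
    N >= 16 B^2 lam_max(H) ln(ell / delta) / (rho^2 eps_target^2).
    """
    if eps_target <= 0:
        raise ValueError("eps_target must be positive")
    bound = 16.0 * B**2 * lam_max_H * math.log(ell / delta) / (rho**2 * eps_target**2)
    return max(1, math.ceil(bound))


if __name__ == "__main__":
    # Hypothetical constants, chosen only to show the 1/sqrt(N) decay.
    params = dict(B=1.0, rho=0.5, lam_max_H=2.0, ell=10, delta=0.05)
    for N in (100, 400, 1600):
        print(N, spectral_margin(N=N, **params))
    print(samples_needed(eps_target=0.1, **params))
```

Quadrupling $N$ halves the margin, which is the $1/\sqrt{N}$ rate in the formula; the dependence on the confidence level $\delta$ is only logarithmic.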
Abstract
We study data-driven least squares (LS) problems with semidefinite (SD) constraints and derive finite-sample guarantees on the spectrum of their optimal solutions when these constraints are relaxed. In particular, we provide a high-confidence bound allowing one to solve a simpler program in place of the full SDLS problem, while ensuring that the eigenvalues of the resulting solution are $\varepsilon$-close to those enforced by the SD constraints. The developed certificate, which consistently shrinks as the number of data points increases, turns out to be easy to compute, distribution-free, and only requires independent and identically distributed samples. Moreover, when the SDLS is used to learn an unknown quadratic function, we establish bounds on the error between a gradient descent iterate minimizing the surrogate cost obtained with no SD constraints and the true minimizer.
