A Theory of the Risk for Optimization with Relaxation and its Application to Support Vector Machines
Marco C. Campi, Simone Garatti
TL;DR
This work develops a distribution-free theory that links the risk of a solution obtained by optimization with relaxed constraints to an observable complexity $s^*$, yielding tight finite-sample risk bounds for data-driven optimization without any knowledge of the data distribution. It generalizes previous results to convex problems in vector spaces and specializes the theory to kernelized support vector methods (SVR, SVDD, SVM), providing explicit risk intervals $[\underline{\epsilon}(s^*),\overline{\epsilon}(s^*)]$ that hold with probability at least $1-\beta$. Asymptotically, the risk $V(x^*)$ of the solution $x^*$ approaches the ratio $s^*/N$ as the sample size $N\to\infty$, which links risk to complexity independently of the underlying data-generation process; the finite-sample bounds, in turn, guide hyperparameter tuning via cost–risk plots. The paper also addresses degeneracy in SVM via a heating technique and validates the theory numerically on a sinc regression example, illustrating how hyperparameters can be selected while controlling out-of-sample risk.
Abstract
In this paper we consider optimization with relaxation, a broad paradigm for data-driven design. This approach was previously studied by the authors in Garatti and Campi (2019), which revealed a deep-seated connection between two concepts: risk (the probability of not satisfying a new, out-of-sample constraint) and complexity (according to a definition introduced in Garatti and Campi (2019)). This connection has profound implications in applications because it implies that the risk can be estimated from the complexity, a quantity that can be measured from the data without any knowledge of the data-generation mechanism. In the present work we establish new results. First, we expand the scope of Garatti and Campi (2019) to a more general setup that covers various algorithms in machine learning. Then, we study classical support vector methods, including SVM (Support Vector Machine), SVR (Support Vector Regression) and SVDD (Support Vector Data Description), and derive new results on the ability of these methods to generalize. All results are valid for any finite size of the data set. When the sample size tends to infinity, we establish the unprecedented result that the risk approaches the ratio between the complexity and the cardinality of the data sample, regardless of the value of the complexity.
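The two central statements of the abstract can be sketched in symbols as follows. This is an informal restatement, not the paper's exact theorem: $V$, $x^*$, $s^*$, $N$, $\beta$ and the interval endpoints follow the TL;DR, while the uncertainty variable $\delta$, the constraint set $\mathcal{X}_\delta$, and the probability $\mathbb{P}$ are notation assumed here for illustration. The mode of convergence in the last line is not specified in this section.

```latex
% Risk: probability that a new, out-of-sample constraint is not satisfied
% (delta, X_delta and P are assumed notation).
\[
  V(x) \;=\; \mathbb{P}\{\, \delta : x \notin \mathcal{X}_\delta \,\}
\]

% Finite-sample statement: the risk of the solution x^* lies in an interval
% that depends only on the observed complexity s^*.
\[
  \mathbb{P}^N\bigl\{\, \underline{\epsilon}(s^*) \le V(x^*) \le \overline{\epsilon}(s^*) \,\bigr\} \;\ge\; 1-\beta
\]

% Asymptotic statement, read informally: the gap between the risk and the
% ratio s^*/N vanishes as the sample size grows.
\[
  V(x^*) - \frac{s^*}{N} \;\longrightarrow\; 0 \qquad \text{as } N \to \infty
\]
```

Read this way, the ratio $s^*/N$ serves as a data-only estimate of the risk, and the interval $[\underline{\epsilon}(s^*),\overline{\epsilon}(s^*)]$ quantifies how far the true risk can be from it at finite $N$.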
