On Reductions and Representations of Learning Problems in Euclidean Spaces

Bogdan Chornomaz, Shay Moran, Tom Waknine

TL;DR

This paper establishes bounds on the minimum Euclidean dimension $D$ needed to reduce a concept class with VC dimension $d$ to a Stochastic Convex Optimization (SCO) problem in $\mathbb{R}^D$, formally addressing the intuitive interpretation of the VC dimension as the number of parameters needed to learn the class.

Abstract

Many practical prediction algorithms represent inputs in Euclidean space and replace the discrete 0/1 classification loss with a real-valued surrogate loss, effectively reducing classification tasks to stochastic optimization. In this paper, we investigate the expressivity of such reductions in terms of key resources, including dimension and the role of randomness. We establish bounds on the minimum Euclidean dimension $D$ needed to reduce a concept class with VC dimension $d$ to a Stochastic Convex Optimization (SCO) problem in $\mathbb{R}^D$, formally addressing the intuitive interpretation of the VC dimension as the number of parameters needed to learn the class. To achieve this, we develop a generalization of the Borsuk-Ulam Theorem that combines the classical topological approach with convexity considerations. Perhaps surprisingly, we show that, in some cases, the number of parameters $D$ must be exponentially larger than the VC dimension $d$, even if the reduction is only slightly non-trivial. We also present natural classification tasks that can be represented in much smaller dimensions by leveraging randomness, as seen in techniques like random initialization. This result resolves an open question posed by Kamath, Montasser, and Srebro (COLT 2020). Our findings introduce new variants of dimension complexity (also known as sign-rank), a well-studied parameter in learning and complexity theory. Specifically, we define an approximate version of sign-rank and another variant that captures the minimum dimension required for a reduction to SCO. We also propose several open questions and directions for future research.
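To make the idea of replacing the 0/1 loss with a real-valued surrogate concrete, here is a minimal sketch of the textbook hinge-loss surrogate for homogeneous halfspaces. It is not taken from the paper and is not the paper's formal notion of a reduction; the function names, step size, and toy data are all hypothetical choices made for illustration. The sketch shows two standard facts: the hinge loss is a convex upper bound on the 0/1 loss, and minimizing it by stochastic subgradient steps yields a classifier for the original task.

```python
import random


def margin(w, x, y):
    """Signed margin y * <w, x> of the example (x, y), with y in {-1, +1}."""
    return y * sum(wi * xi for wi, xi in zip(w, x))


def zero_one_loss(w, x, y):
    """Discrete 0/1 loss of the halfspace sign(<w, x>); a zero margin counts as an error."""
    return 1.0 if margin(w, x, y) <= 0 else 0.0


def hinge_loss(w, x, y):
    """Convex surrogate max(0, 1 - y<w, x>); it upper-bounds the 0/1 loss."""
    return max(0.0, 1.0 - margin(w, x, y))


def subgradient_step(w, x, y, lr):
    """One subgradient step on the hinge loss of a single example."""
    if margin(w, x, y) < 1.0:
        # A subgradient of the hinge loss here is -y * x, so we move along +y * x.
        return [wi + lr * y * xi for wi, xi in zip(w, x)]
    return list(w)


def run_sgd(samples, dim, lr=0.1, epochs=50, seed=0):
    """Minimize the (unregularized) average hinge loss by stochastic subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in rng.sample(samples, len(samples)):
            w = subgradient_step(w, x, y, lr)
    return w


# Hypothetical, linearly separable toy data in R^2.
TOY_SAMPLES = [
    ((1.0, 2.0), 1),
    ((2.0, 1.0), 1),
    ((-1.0, -2.0), -1),
    ((-2.0, -1.5), -1),
]

if __name__ == "__main__":
    w = run_sgd(TOY_SAMPLES, dim=2)
    errors = sum(zero_one_loss(w, x, y) for x, y in TOY_SAMPLES)
    print("learned w:", w, "0/1 errors:", errors)
```

In this sketch the Euclidean dimension of the optimization problem equals the input dimension. The paper's question is how small that dimension can be made for a general concept class of VC dimension $d$.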

Paper Structure

This paper contains 19 sections, 24 theorems, 69 equations, 2 figures.

Key Result

Theorem 1

Let $C\subseteq\{\pm 1\}^X$ be a binary concept class. If for some $\beta<1/2$ and $\alpha > 0$ there exists an $(\alpha,\beta)$-reduction from the task of learning $C$ in the realizable case to a stochastic convex optimization task in $\mathbb{R}^d$, with loss functions $\{\ell_z\}_{z\in Z}$ satisf

Figures (2)

  • Figure 1: A reduction from problem $A$, which we wish to solve, to problem $B$, which we can solve. The reduction maps instances of $A$ to instances of $B$, and solutions of $B$ back to solutions of $A$. The reduction is successful if, when combined with an algorithm for $B$, it solves problem $A$.
  • Figure 2: An $(\alpha,\beta)$-reduction.

Theorems & Definitions (67)

  • Definition 1: Learning task
  • Example 2: PAC-learning
  • Example 3: PAC-learning for partial concept classes
  • Example 4: Stochastic convex optimization
  • Example 5: General setting of learning
  • Definition 6: Reductions
  • Theorem 1: Binary classification vs. stochastic convex optimization
  • Example 7: SVM with unregularized hinge loss
  • Example 8: Hard SVM
  • Theorem 2: A variant of Theorem 1 for $\infty$-valued SCO
  • ...and 57 more