On Reductions and Representations of Learning Problems in Euclidean Spaces

Bogdan Chornomaz; Shay Moran; Tom Waknine

TL;DR

This paper establishes bounds on the minimum Euclidean dimension $D$ needed to reduce a concept class with VC dimension $d$ to a Stochastic Convex Optimization (SCO) problem in $\mathbb{R}^D$, formally addressing the intuitive interpretation of the VC dimension as the number of parameters needed to learn the class.

Abstract

Many practical prediction algorithms represent inputs in Euclidean space and replace the discrete 0/1 classification loss with a real-valued surrogate loss, effectively reducing classification tasks to stochastic optimization. In this paper, we investigate the expressivity of such reductions in terms of key resources, including dimension and the role of randomness. We establish bounds on the minimum Euclidean dimension $D$ needed to reduce a concept class with VC dimension $d$ to a Stochastic Convex Optimization (SCO) problem in $\mathbb{R}^D$, formally addressing the intuitive interpretation of the VC dimension as the number of parameters needed to learn the class. To achieve this, we develop a generalization of the Borsuk-Ulam Theorem that combines the classical topological approach with convexity considerations. Perhaps surprisingly, we show that, in some cases, the number of parameters $D$ must be exponentially larger than the VC dimension $d$, even if the reduction is only slightly non-trivial. We also present natural classification tasks that can be represented in much smaller dimensions by leveraging randomness, as seen in techniques like random initialization. This result resolves an open question posed by Kamath, Montasser, and Srebro (COLT 2020). Our findings introduce new variants of \emph{dimension complexity} (also known as \emph{sign-rank}), a well-studied parameter in learning and complexity theory. Specifically, we define an approximate version of sign-rank and another variant that captures the minimum dimension required for a reduction to SCO. We also propose several open questions and directions for future research.
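
For readers unfamiliar with the parameter the abstract builds on, the following is a sketch of the standard textbook definition of sign-rank (dimension complexity). The notation ($M$, $R$, $\mathcal{H}$, $\mathcal{X}$, $\varphi$, $w_h$) is assumed here for illustration and is not taken from the paper; the paper's approximate and SCO-based variants are not reproduced.

```latex
% Sign-rank of a sign matrix M: the least rank of a real matrix
% whose entries agree in sign with M everywhere.
\[
  \operatorname{sign\text{-}rank}(M)
  \;=\;
  \min\bigl\{\operatorname{rank}(R) \;:\; R \in \mathbb{R}^{m \times n},\;
  \operatorname{sign}(R_{ij}) = M_{ij} \ \text{for all } i, j \bigr\},
  \qquad M \in \{\pm 1\}^{m \times n}.
\]

% Equivalent view for a concept class H over a domain X:
% the least D such that H embeds into homogeneous halfspaces in R^D.
\[
  \text{there exist } \varphi : \mathcal{X} \to \mathbb{R}^{D}
  \text{ and } w_h \in \mathbb{R}^{D} \ (h \in \mathcal{H})
  \text{ such that }
  h(x) = \operatorname{sign}\bigl(\langle w_h, \varphi(x) \rangle\bigr)
  \ \text{for all } h \in \mathcal{H},\ x \in \mathcal{X}.
\]
```

The two views coincide when $M$ is taken to be the matrix with rows indexed by $\mathcal{H}$, columns indexed by $\mathcal{X}$, and entries $M_{h,x} = h(x)$.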
