Exact VC-Dimensions of Certain Geometric Set Systems
Pantelis E. Eleftheriou, Aris Papadopoulos, Francis Westhead
TL;DR
This work determines the exact VC-dimensions of the 2-fold and 3-fold unions of lines in the plane, showing $\mathsf{VC}\text{-}\mathsf{dim}(\mathcal{F}_2)=5$ and $\mathsf{VC}\text{-}\mathsf{dim}(\mathcal{F}_3)=9$, and gives complete classifications of the maximal shatterable point sets up to shatter-isomorphism (two types for size 5 and five types for size 9). The authors introduce shatter-isomorphism to compare set systems by their incidence structure and axiomatize necessary and sufficient conditions for shattering, which makes it possible to count maximal shatterable configurations exactly. They further quantify the maximal-shattering sequence, establishing $s_2(\mathcal{F}_1)=2$ and $s_3(\mathcal{F}_1)=5$, with explicit representatives. Finally, the results are extended to higher dimensions: the VC-dimension results for $\mathcal{F}_k$ transfer to analogues involving affine subspaces via a reduction to incidence-structure isomorphism.
Abstract
The VC-dimension of a family of sets is a measure of its combinatorial complexity used in machine learning theory, computational geometry, and even model theory. Computing the VC-dimension of the $k$-fold union of geometric set systems has been a difficult open combinatorial problem, dating back to Blumer, Ehrenfeucht, Haussler, and Warmuth in 1989, who asked about the VC-dimension of $k$-fold unions of half-spaces in $\mathbb{R}^d$. Let $\mathcal{F}_1$ denote the family of all lines in $\mathbb{R}^2$. It is well known that $\mathsf{VC}\text{-}\mathsf{dim}(\mathcal{F}_1) = 2$. In this paper, we study the $2$-fold and $3$-fold unions of $\mathcal{F}_1$, denoted $\mathcal{F}_2$ and $\mathcal{F}_3$, respectively. We show that $\mathsf{VC}\text{-}\mathsf{dim}(\mathcal{F}_2) = 5$ and $\mathsf{VC}\text{-}\mathsf{dim}(\mathcal{F}_3) = 9$. Moreover, we give complete characterisations of the subsets of $\mathbb{R}^2$ of maximal size that can be shattered by $\mathcal{F}_2$ and $\mathcal{F}_3$, showing that there are exactly two and five such subsets, respectively, up to isomorphism in the language of the point-line incidence relation.
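
To make the notion of shattering by $\mathcal{F}_k$ concrete, the following is a minimal brute-force sketch in Python; it is an illustration only and not the method used in the paper. It assumes distinct points with integer coordinates (so that collinearity tests are exact) and relies on the observation that a subset $S$ of a finite point set $P$ is cut out by a union of $k$ lines exactly when $S$ can be covered by at most $k$ lines, none of which passes through a point of $P \setminus S$. The function names `collinear`, `cut_out`, and `shattered` are hypothetical helpers introduced for this example.

```python
from itertools import combinations


def collinear(p, q, r):
    """Exact collinearity test for integer points via the cross product."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0


def cut_out(target, rest, k):
    """Return True if at most k lines cover every point of `target`
    while avoiding every point of `rest`."""
    if not target:
        return True  # nothing left to cover; spare lines can miss all points
    if k == 0:
        return False
    p, others = target[0], target[1:]
    # Option 1: a line through p that meets no other point of the finite set
    # (such a line always exists).
    if cut_out(others, rest, k - 1):
        return True
    # Option 2: the line through p and another target point q, provided it
    # avoids `rest`; it covers every target point collinear with p and q.
    for q in others:
        if any(collinear(p, q, r) for r in rest):
            continue
        remaining = [r for r in others if not collinear(p, q, r)]
        if cut_out(remaining, rest, k - 1):
            return True
    return False


def shattered(points, k):
    """Return True if every subset of `points` is cut out by a union of k lines."""
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            rest = [p for p in points if p not in subset]
            if not cut_out(list(subset), rest, k):
                return False
    return True


# Three collinear points plus two further points, with no other collinearities.
five_points = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 2)]
print(shattered(five_points, 2))
```

The search is exponential in the number of points, which is acceptable for configurations of size 5 or 9 but is not intended to scale. Note that five points with no three collinear cannot be shattered by $\mathcal{F}_2$, since two lines cover at most four of them; this is why the example configuration contains a collinear triple.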
