Sample compression schemes for balls in graphs
Jérémie Chalopin, Victor Chepoi, Fionn Mc Inerney, Sébastien Ratel, Yann Vaxès
TL;DR
The paper studies sample compression schemes for the family of balls in graphs, connecting the general question of whether VC-dimension bounds compression size to concrete constructions for specific graph classes. It gives explicit proper labeled and unlabeled schemes of small size for several graph families (trees, cycles, interval graphs, cube-free median graphs, split graphs, planar graphs, and hyperbolic graphs), including variants for balls of a fixed radius and approximate schemes for δ-hyperbolic graphs. Key contributions include unlabeled and labeled sample compression schemes (USCS/LSCS) of size 2 for metric trees and of size 3 to 6 for cycles and trees of cycles (the labeled schemes have size 3 for cycles and 6 for trees of cycles), of size 4 for interval graphs, of size 22 for cube-free median graphs, and of size 4 for balls of radius 1 in planar graphs, along with a general approximate framework for δ-hyperbolic graphs. These results advance the understanding of the sample compression conjecture in a graph-theoretic setting and provide compression schemes tailored to specific geometric structures. Several questions remain open, in particular on the optimality of the sizes and on extensions to further graph classes.
Abstract
One of the open problems in machine learning is whether any set-family of VC-dimension $d$ admits a sample compression scheme of size $O(d)$. In this paper, we study this problem for balls in graphs. For a ball $B=B_r(x)$ of a graph $G=(V,E)$, a realizable sample for $B$ is a signed subset $X=(X^+,X^-)$ of $V$ such that $B$ contains $X^+$ and is disjoint from $X^-$. A proper sample compression scheme of size $k$ consists of a compressor and a reconstructor. The compressor maps any realizable sample $X$ to a subsample $X'$ of size at most $k$. The reconstructor maps each such subsample $X'$ to a ball $B'$ of $G$ such that $B'$ includes $X^+$ and is disjoint from $X^-$. For balls of arbitrary radius $r$, we design proper labeled sample compression schemes of size $2$ for trees, of size $3$ for cycles, of size $4$ for interval graphs, of size $6$ for trees of cycles, and of size $22$ for cube-free median graphs. For balls of a given radius, we design proper labeled sample compression schemes of size $2$ for trees and of size $4$ for interval graphs. We also design approximate sample compression schemes of size $2$ for balls of $\delta$-hyperbolic graphs.
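To make the definitions of realizable sample, compressor, and reconstructor concrete, here is a minimal Python sketch for balls in a path graph. It is not the paper's construction: the paper obtains size $2$ for trees, while this deliberately naive scheme keeps up to $4$ labeled vertices and reconstructs by brute-force search. All function names (`path_graph`, `ball`, `compress`, `reconstruct`, `check_scheme`) are hypothetical and chosen only for illustration.

```python
from collections import deque
from itertools import product

def path_graph(n):
    """Adjacency lists of the path 0 - 1 - ... - (n-1)."""
    return {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}

def ball(adj, center, radius):
    """B_radius(center): vertices at distance <= radius, found by BFS."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return frozenset(dist)

def consistent(b, pos, neg):
    """True if the ball b contains X^+ (pos) and is disjoint from X^- (neg)."""
    return pos <= b and not (neg & b)

def compress(pos, neg):
    """Illustrative compressor for paths (assumption, not the paper's scheme).

    Keeps the two extreme positives and the nearest negative on each side.
    With no positives, keeps the negative just before the first free vertex.
    """
    if not pos:
        v = 0
        while v in neg:
            v += 1
        return frozenset(), frozenset({v - 1} if v > 0 else ())
    lo, hi = min(pos), max(pos)
    left = [u for u in neg if u < lo]
    right = [u for u in neg if u > hi]
    kept = ([max(left)] if left else []) + ([min(right)] if right else [])
    return frozenset({lo, hi}), frozenset(kept)

def reconstruct(adj, sub_pos, sub_neg):
    """Map a subsample X' back to a ball of the path."""
    if not sub_pos:
        v = max(sub_neg) + 1 if sub_neg else 0
        return ball(adj, v, 0)
    for x in adj:
        for r in range(len(adj)):
            b = ball(adj, x, r)
            if consistent(b, sub_pos, sub_neg):
                return b
    raise ValueError("subsample is not realizable by any ball")

def check_scheme(n):
    """Exhaustively verify the scheme on every realizable sample of the path."""
    adj = path_graph(n)
    balls = {ball(adj, x, r) for x in adj for r in range(n)}
    for labels in product((1, -1, 0), repeat=n):
        pos = frozenset(v for v in range(n) if labels[v] == 1)
        neg = frozenset(v for v in range(n) if labels[v] == -1)
        if not any(consistent(b, pos, neg) for b in balls):
            continue  # not a realizable sample
        sub_pos, sub_neg = compress(pos, neg)
        assert sub_pos <= pos and sub_neg <= neg
        assert len(sub_pos) + len(sub_neg) <= 4
        assert consistent(reconstruct(adj, sub_pos, sub_neg), pos, neg)
    return True
```

The sketch relies on the fact that every ball of a path is a set of consecutive vertices, so a ball that contains both extreme positives and avoids the nearest negative on each side is consistent with the whole sample. Calling `check_scheme(6)` runs the exhaustive check over all signed subsets of a 6-vertex path.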
