Practical Computation of Graph VC-Dimension

David Coudert, Mónika Csikós, Guillaume Ducoffe, Laurent Viennot

TL;DR

This work tackles the practical computation of the graph $\mathrm{VCdim}$, defined as the VC-dimension of the closed-neighborhood set system $\{N_G[v]:v\in V\}$ in a graph $G$. It presents a practical exact algorithm that incrementally tightens a lower bound on $\mathrm{VCdim}$ and prunes the search using degree-based bounds, traces via bitmasks, and partition-refinement-based reductions, achieving computation on graphs with millions of nodes where $\mathrm{VCdim}$ typically lies in the small range $3$ to $8$. The authors prove $W[1]$-hardness of the problem under the natural parameterization by the dimension, and they derive several sharp, linear bounds relating $\mathrm{VCdim}$ to standard graph parameters like maximum degree, degeneracy, and matching number. Empirically, the method performs well across diverse real networks and synthetic graphs, with a publicly available implementation, and the experiments reveal insightful patterns for random graphs and power-law networks regarding the onset of high VC-dimension. Overall, the paper provides both theoretical hardness results and a practical toolkit for estimating and computing graph VC-dimension, highlighting its potential as a useful graph parameter in algorithm design and analysis.
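To illustrate what a degree-based bound can look like, here is a minimal sketch of one elementary counting bound; it is an assumption for illustration, not necessarily the bound used in the paper. If a set $S$ of size $d$ is shattered, each $v \in S$ lies in $2^{d-1}$ subsets of $S$, and each of these must be the trace of a distinct vertex of $N_G[v]$, so $\deg(v)+1 \geq 2^{d-1}$. Hence $\mathrm{VCdim}(G)$ is at most the largest $d$ for which at least $d$ vertices satisfy this inequality. The function name `degree_upper_bound` is hypothetical.

```python
def degree_upper_bound(degrees):
    """Upper bound on the graph VC-dimension from the degree sequence.

    Returns the largest d such that at least d vertices satisfy
    deg(v) + 1 >= 2^(d-1). Illustrative counting bound only; this is
    an assumption, not taken from the paper's implementation.
    """
    d = 0
    while True:
        candidate = d + 1
        need = 1 << (candidate - 1)  # required closed-neighborhood size
        eligible = sum(1 for deg in degrees if deg + 1 >= need)
        if eligible >= candidate:
            d = candidate
        else:
            # 'eligible' can only shrink as the candidate grows, so stop here.
            return d


# Path on 5 vertices: degree sequence 1, 2, 2, 2, 1.
print(degree_upper_bound([1, 2, 2, 2, 1]))  # 2
```

The bound can be far from tight: on the complete graph $K_4$ it returns 3, although all closed neighborhoods coincide and the VC-dimension is 0. Its use in a search is only to discard vertices and target dimensions that cannot possibly work.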

Abstract

For any set system $H=(V,R)$ with $R \subseteq 2^V$, a subset $S \subseteq V$ is called shattered if every $S' \subseteq S$ is the intersection of $S$ with some set in $R$. The VC-dimension of $H$ is the size of a largest shattered set in $V$. In this paper, we focus on the problem of computing the VC-dimension of graphs. In particular, given a graph $G=(V,E)$, the VC-dimension of $G$ is defined as the VC-dimension of $(V, \mathcal N)$, where $\mathcal N$ contains each subset of $V$ that can be obtained as the closed neighborhood of some vertex $v \in V$ in $G$. Our main contribution is an algorithm for computing the VC-dimension of any graph, whose effectiveness is shown through experiments on various types of practical graphs, including graphs with millions of vertices. A key aspect of its efficiency is that practical graphs have small VC-dimension, up to 8 in our experiments. As a by-product, we present several new bounds relating the graph VC-dimension to other classical graph-theoretic notions. We also establish the $W[1]$-hardness of the graph VC-dimension problem by extending a previous result for arbitrary set systems.
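The definition above can be checked directly on small graphs. The following is a brute-force sketch, not the paper's algorithm: it encodes each closed neighborhood as a bitmask and tests every candidate set for shattering. The function names and the edge-list input format are assumptions made for illustration, and the running time is exponential, so it only suits tiny instances.

```python
from itertools import combinations


def closed_neighborhood_masks(n, edges):
    """One bitmask per vertex; bit u of masks[v] is set iff u is in N[v]."""
    masks = [1 << v for v in range(n)]  # v belongs to its own closed neighborhood
    for u, v in edges:
        masks[u] |= 1 << v
        masks[v] |= 1 << u
    return masks


def is_shattered(subset, masks):
    """True iff all 2^|S| subsets of S occur as traces N[v] & S."""
    s_mask = 0
    for v in subset:
        s_mask |= 1 << v
    traces = {m & s_mask for m in masks}
    return len(traces) == 1 << len(subset)


def vc_dimension_bruteforce(n, edges):
    """Graph VC-dimension by exhaustive search (illustrative only)."""
    masks = closed_neighborhood_masks(n, edges)
    d = 0
    k = 1
    # A shattered set of size k needs 2^k distinct traces, hence 2^k <= n.
    while (1 << k) <= n:
        if any(is_shattered(c, masks) for c in combinations(range(n), k)):
            d = k
            k += 1
        else:
            break  # subsets of shattered sets are shattered, so we can stop
    return d


# Path on 5 vertices 0-1-2-3-4: the set {0, 2} is shattered.
path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(vc_dimension_bruteforce(5, path5))  # 2
```

In the path example, the traces on $\{0, 2\}$ are $\{0\}$ (from vertex 0), $\{0,2\}$ (vertex 1), $\{2\}$ (vertices 2 and 3) and $\emptyset$ (vertex 4), so all four subsets appear. No set of size 3 can be shattered because that would require 8 distinct closed neighborhoods.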

Paper Structure (21 sections, 8 theorems, 3 equations, 4 figures, 5 tables, 2 algorithms)

Key Result

Theorem 1

For any graph $G$ and parameter $k \leq |V(G)|$, there exists a graph $H_G$ such that $G$ contains a $k$-clique if and only if the VC-dimension of $H_G$ is at least $k$. Furthermore, we can construct $H_G$ from $G$ in $\mathcal{O}(k 2^k n^2)$ time.

Figures (4)

  • Figure 1: Computation time $t$ in seconds versus the estimated number $x$ of tentative shattered sets considered by $KBG$: each network in the dataset is represented by a disk with coordinates $(x,t)$, whose color indicates the VC-dimension $d$ of the network and whose size is proportional to the logarithm of the number of high-degree nodes.
  • Figure 2: The number $y$ of visited shattered sets versus the number $x$ of shattered sets in $H'$: each network in the dataset is represented by a disk with coordinates $(x,y)$, whose color indicates the VC-dimension $d$ of the network and whose size is proportional to the logarithm of $|H'|$ ($H'$ denotes the set of nodes with degree at least $2^d$).
  • Figure 3: Top: the average VC-dimension of $G_{n,p}$ as a function of $p$ for $n=32,45,64,100, 128$. Bottom: a zoom on values $p\in[0,0.3]$ including additional curves for $n=256,400$.
  • Figure 4: Average VC-dimension of a power-law random graph with respect to the exponent $\beta$ of the law for various numbers $n$ of nodes. (Curves for large $n$ are truncated for low values of $\beta$.)

Theorems & Definitions (15)

  • Theorem 1
  • Lemma 1
  • proof
  • Corollary 1
  • Lemma 2
  • Lemma 3
  • proof
  • proof
  • Theorem 2: Theorem 3 in downey1993parameterized
  • Claim
  • ...and 5 more