Predicting The Cop Number Using Machine Learning
Meagan Mann, Christian Muise, Erin Meger
TL;DR
This work tackles predicting the cop number $c(G)$ of a graph, a quantity that is computationally hard to determine exactly, by comparing classical machine learning on handcrafted graph invariants with graph neural networks that learn from topology alone. Using a dataset of small graphs (2–13 vertices) with exact cop numbers and a broad set of invariants, the authors show that tree-based methods (e.g., HistGradientBoosting) achieve high accuracy (~0.97) on these graphs, while a Graph Isomorphism Network attains ~0.95 accuracy without handcrafted features. Interpretability analyses (SHAP and permutation importance) reveal that connectivity, density, and clique- and treewidth-related features are the strongest predictors, aligning with theoretical insights about pursuit–evasion on graphs. Overall, ML approaches provide scalable, informative approximations that complement exact cop-number algorithms and offer structural insight into what drives a graph's cop number.
Abstract
Cops and Robbers is a pursuit-evasion game played on a graph, first introduced independently by Quilliot \cite{quilliot1978jeux} and Nowakowski and Winkler \cite{NOWAKOWSKI1983235} over four decades ago. A main interest in the recent literature is identifying the cop number of graph families. The cop number of a graph, $c(G)$, is defined as the minimum number of cops required to guarantee capture of the robber. Determining the cop number is computationally difficult, and exact algorithms are typically restricted to small graph families. This paper investigates whether classical machine learning methods and graph neural networks can accurately predict a graph's cop number from its structural properties, and identifies which properties most strongly influence this prediction. Among the classical machine learning models, tree-based models achieve high prediction accuracy despite class imbalance, whereas graph neural networks achieve comparable results without explicit feature engineering. The interpretability analysis shows that the most predictive features are related to node connectivity, clustering, clique structure, and width parameters, which aligns with known theoretical results. Our findings suggest that machine learning approaches can complement existing cop number algorithms by offering scalable approximations where exact computation is infeasible.
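
To make the definition of $c(G)$ concrete, the following minimal sketch tests only the simplest case, $c(G) = 1$ (a cop-win graph). It relies on the classical characterization from the works cited above: a finite graph is cop-win exactly when it can be reduced to a single vertex by repeatedly deleting a "corner", a vertex whose closed neighbourhood is contained in that of another vertex. The function name `is_cop_win` and the adjacency-dictionary input format are assumptions made for illustration; this is not the pipeline used in the paper, and it does not compute cop numbers greater than one.

```python
def is_cop_win(adj):
    """Return True if the graph has cop number 1 (illustrative sketch).

    adj: dict mapping each vertex to the set of its neighbours
         (undirected simple graph; hypothetical input format).
    """
    # Closed neighbourhood N[v] = neighbours of v together with v itself.
    closed = {v: set(nbrs) | {v} for v, nbrs in adj.items()}
    alive = set(adj)

    removed_one = True
    while len(alive) > 1 and removed_one:
        removed_one = False
        for u in list(alive):
            # N[u] restricted to the vertices that have not been deleted.
            n_u = closed[u] & alive
            # u is a corner if some other surviving vertex v dominates it.
            if any(v != u and n_u <= closed[v] for v in alive):
                alive.remove(u)
                removed_one = True
                break

    # Cop-win exactly when corner deletion leaves a single vertex.
    return len(alive) == 1


def path_graph(n):
    """Adjacency dict of the path on vertices 0..n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def cycle_graph(n):
    """Adjacency dict of the cycle on vertices 0..n-1 (n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


if __name__ == "__main__":
    print(is_cop_win(path_graph(4)))   # paths are trees, hence cop-win
    print(is_cop_win(cycle_graph(4)))  # the 4-cycle needs two cops
```

Deciding whether $c(G) \le k$ for larger $k$ requires searching over joint positions of $k$ cops and the robber, which is what makes exact computation expensive as graphs grow and motivates the learned approximations studied here.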
