Efficient Algorithms for Learning Monophonic Halfspaces in Graphs
Marco Bressan, Emmanuel Esposito, Maximilian Thiessen
TL;DR
The paper studies learning monophonic halfspaces on graphs. It introduces a polynomial-time consistency checker that reduces consistency testing to a family of 2-SAT problems, which enables near-optimal passive learning with sample complexity $O\bigl(\frac{\omega(G)\log(1/\varepsilon)+\log(1/\delta)}{\varepsilon}\bigr)$ in the realizable PAC setting. It also gives agnostic PAC results with runtime $|\mathcal{H}_{\mathrm{MP}}(G)|\operatorname{poly}(n)$, a refined bound $|\mathcal{H}_{\mathrm{MP}}(G)|\le \frac{4m\,2^{\omega(G)}}{\omega(G)}+2$ on the size of the concept class, and an ERM approach based on enumeration. In active learning, the authors achieve polynomial-time learning with query complexity $O\bigl(h(G)+\log \operatorname{diam}_g(G)+\omega(G)\bigr)$, aided by a minimum monophonic hull and shadow-based inferences. In online learning, the realizable setting admits a mistake bound of $O\bigl(\omega(G)\log n\bigr)$ via Winnow and a Halving-based bound of $O\bigl(\omega(G)+\log(n/\omega(G))\bigr)$; the paper also explores agnostic online guarantees. Together, these results resolve open questions about learning m-halfspaces, provide sharp structural insights (e.g., shadow decompositions), and contrast favorably with NP-hardness results for geodesic halfspaces.
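To illustrate the Halving-based online bound, the following is a minimal sketch of the generic Halving algorithm run over an explicitly enumerated hypothesis class. It is not the authors' implementation. The function name `halving`, the toy path graph, and the example stream are all hypothetical, and the sketch assumes that on a path graph the monophonic halfspaces are exactly the prefixes and suffixes of the path (together with the empty set and the full vertex set). In the realizable setting, Halving makes at most $\log_2$ of the class size mistakes, which is why a bound on $|\mathcal{H}_{\mathrm{MP}}(G)|$ translates into a mistake bound.

```python
import math

def halving(hypotheses, stream):
    """Run Halving on a stream of (vertex, label) pairs and count mistakes.

    hypotheses: list of frozensets, each the set of vertices labelled 1.
    stream: iterable of (vertex, label) pairs with label in {0, 1},
            assumed consistent with some hypothesis (realizable case).
    """
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        # Predict with the majority vote of the surviving hypotheses.
        votes = sum(1 for h in version_space if x in h)
        prediction = 1 if 2 * votes >= len(version_space) else 0
        if prediction != y:
            mistakes += 1
        # Keep only the hypotheses that agree with the revealed label.
        version_space = [h for h in version_space if (x in h) == bool(y)]
    return mistakes

# Toy class (assumption): halfspaces of the path 0-1-...-7.
n = 8
prefixes = [frozenset(range(k)) for k in range(n + 1)]   # includes empty and full set
suffixes = [frozenset(range(k, n)) for k in range(1, n)]
hypotheses = prefixes + suffixes

target = frozenset(range(3))                              # hypothetical target {0, 1, 2}
stream = [(v, int(v in target)) for v in [4, 1, 6, 2, 3, 0, 5, 7]]
mistakes = halving(hypotheses, stream)
print(mistakes, "mistakes; guarantee:", math.log2(len(hypotheses)))
```

Each mistake removes at least half of the version space, so the number of mistakes never exceeds the logarithm of the class size, whatever order the vertices arrive in.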
Abstract
We study the problem of learning a binary classifier on the vertices of a graph. In particular, we consider classifiers given by monophonic halfspaces, partitions of the vertices that are convex in a certain abstract sense. Monophonic halfspaces, and related notions such as geodesic halfspaces, have recently attracted interest, and several connections have been drawn between their properties (e.g., their VC dimension) and the structure of the underlying graph $G$. We prove several novel results for learning monophonic halfspaces in the supervised, online, and active settings. Our main result is that a monophonic halfspace can be learned with near-optimal passive sample complexity in time polynomial in $n = |V(G)|$. This requires us to devise a polynomial-time algorithm for consistent hypothesis checking, based on several structural insights on monophonic halfspaces and on a reduction to $2$-satisfiability. We prove similar results for the online and active settings. We also show that the concept class can be enumerated with delay $\operatorname{poly}(n)$, and that empirical risk minimization can be performed in time $2^{\omega(G)}\operatorname{poly}(n)$ where $\omega(G)$ is the clique number of $G$. These results answer open questions from the literature (González et al., 2020), and show a contrast with geodesic halfspaces, for which some of the said problems are NP-hard (Seiffarth et al., 2023).
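To make the notion of a monophonic halfspace concrete, here is a brute-force sketch for very small graphs. It is not the paper's polynomial-time method. It assumes the standard definition, which the abstract does not spell out: a vertex set is monophonically convex if it contains every vertex of every induced (chordless) path between any two of its vertices, and a halfspace is a set whose complement is also convex. The helper names `induced_paths`, `is_m_convex`, and `m_halfspaces` are hypothetical, and the running time is exponential.

```python
from itertools import combinations

def induced_paths(adj, u, v):
    """Yield every induced (chordless) path from u to v as a list of vertices."""
    def extend(path):
        last = path[-1]
        if last == v:
            yield list(path)
            return
        for w in adj[last]:
            if w in path:
                continue
            # A chordless path allows w to touch only the current last vertex.
            if any(w in adj[x] for x in path[:-1]):
                continue
            yield from extend(path + [w])
    yield from extend([u])

def is_m_convex(adj, subset):
    """True if subset contains all vertices of all induced paths between its members."""
    return all(
        set(path) <= subset
        for u, v in combinations(sorted(subset), 2)
        for path in induced_paths(adj, u, v)
    )

def m_halfspaces(adj):
    """Enumerate all sets H such that both H and its complement are m-convex."""
    vertices = set(adj)
    found = []
    for size in range(len(vertices) + 1):
        for combo in combinations(sorted(vertices), size):
            side = set(combo)
            if is_m_convex(adj, side) and is_m_convex(adj, vertices - side):
                found.append(frozenset(side))
    return found

# Example graph (assumption): the 4-cycle 0-1-2-3-0 as an adjacency dict.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
for side in m_halfspaces(cycle4):
    print(sorted(side))
```

On the 4-cycle, the only nontrivial halfspaces are the pairs of adjacent vertices: two opposite vertices are joined by two induced paths through the remaining vertices, so any convex set containing them is the whole cycle.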
