The Weisfeiler-Leman Dimension of Conjunctive Queries

Andreas Göbel, Leslie Ann Goldberg, Marc Roth

TL;DR

This work determines the WL-dimension of graph parameters that count the answers to a fixed conjunctive query (CQ). It introduces the semantic extension width sew(H,X), defined as the minimum extension width over all CQs that are counting-equivalent to (H,X), and proves that the WL-dimension of any connected CQ with at least one free variable equals sew(H,X). The authors extend the known results for full CQs (where sew equals treewidth) to all CQs, relate the characterisation to higher-order Graph Neural Networks, and derive consequences for the parameterized complexity of counting CQ answers, for linear combinations of CQs (quantum queries), and for star and dominating-set queries. In particular, they show that acyclic CQs can have unbounded WL-dimension, that the k-star query has sew = k, and that counting size-k dominating sets has WL-dimension exactly k. Together, the results connect logical counting, WL indistinguishability, GNN expressiveness, and computational complexity, and delineate which graph patterns can be counted within WL/GNN-based paradigms.
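To make "counting answers" concrete, here is a minimal brute-force sketch for the k-star query, assuming the usual reading in which the k leaves are the free variables and the centre is existentially quantified. The function names and the adjacency-dictionary graph encoding are illustrative choices, not taken from the paper. The sketch also counts homomorphisms of the star, to show that answers and homomorphisms are different quantities once a variable is quantified.

```python
from itertools import product

def count_star_answers(adj, k):
    """Count assignments (v_1, ..., v_k) of the free variables that
    have at least one common neighbour y (the quantified centre)."""
    vertices = list(adj)
    answers = 0
    for assignment in product(vertices, repeat=k):
        # The assignment is an answer if SOME centre y is adjacent to all v_i.
        if any(all(v in adj[y] for v in assignment) for y in vertices):
            answers += 1
    return answers

def count_star_homomorphisms(adj, k):
    """Count homomorphisms from the k-star: pick a centre y, then map
    each of the k leaves independently to a neighbour of y."""
    return sum(len(adj[y]) ** k for y in adj)

# Path on three vertices 0 - 1 - 2 (hypothetical example graph).
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(count_star_answers(path, 2))        # 5
print(count_star_homomorphisms(path, 2))  # 6
```

On the path, the assignment (1, 1) has two witnesses (centres 0 and 2), so it contributes twice to the homomorphism count but only once to the answer count.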

Abstract

The Weisfeiler-Leman (WL) dimension of a graph parameter $f$ is the minimum $k$ such that, if $G_1$ and $G_2$ are indistinguishable by the $k$-dimensional WL-algorithm, then $f(G_1)=f(G_2)$. The WL-dimension of $f$ is $\infty$ if no such $k$ exists. We study the WL-dimension of graph parameters characterised by the number of answers from a fixed conjunctive query to the graph. Given a conjunctive query $\varphi$, we quantify the WL-dimension of the function that maps every graph $G$ to the number of answers of $\varphi$ in $G$. The works of Dvořák (J. Graph Theory 2010), Dell, Grohe, and Rattan (ICALP 2018), and Neuen (arXiv 2023) have answered this question for full conjunctive queries, which are conjunctive queries without existentially quantified variables. For such queries $\varphi$, the WL-dimension is equal to the treewidth of the Gaifman graph of $\varphi$. In this work, we give a characterisation that applies to all conjunctive queries. Given any conjunctive query $\varphi$, we prove that its WL-dimension is equal to the semantic extension width $\mathsf{sew}(\varphi)$, a novel width measure that can be thought of as a combination of the treewidth of $\varphi$ and its quantified star size, an invariant introduced by Durand and Mengel (ICDT 2013) describing how the existentially quantified variables of $\varphi$ are connected with the free variables. Using the recently established equivalence between the WL-algorithm and higher-order Graph Neural Networks (GNNs) due to Morris et al. (AAAI 2019), we obtain as a consequence that the function counting answers to a conjunctive query $\varphi$ cannot be computed by GNNs of order smaller than $\mathsf{sew}(\varphi)$.
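The definition of WL-dimension can be illustrated with a small sketch of the 1-dimensional case (colour refinement). The code below is an illustrative implementation, not the paper's; the helper names and the adjacency-dictionary encoding are assumptions. It checks that the 6-cycle and the disjoint union of two triangles are indistinguishable by colour refinement, although the triangle query (a full conjunctive query whose Gaifman graph has treewidth 2) has different answer counts on the two graphs. This agrees with the statement that the WL-dimension of a full query equals its treewidth, so dimension 1 cannot suffice here.

```python
from collections import Counter

def wl1_indistinguishable(adj1, adj2):
    """Run colour refinement on the disjoint union of two graphs and
    report whether their colour histograms agree in every round."""
    # Tag vertices so the two vertex sets stay disjoint.
    adj = {(1, v): {(1, u) for u in nb} for v, nb in adj1.items()}
    adj.update({(2, v): {(2, u) for u in nb} for v, nb in adj2.items()})
    colour = {v: 0 for v in adj}
    while True:
        # New signature: own colour plus sorted multiset of neighbour colours.
        sig = {v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: ids[sig[v]] for v in adj}
        hist1 = Counter(c for (g, _), c in new.items() if g == 1)
        hist2 = Counter(c for (g, _), c in new.items() if g == 2)
        if hist1 != hist2:
            return False  # some round separates the two graphs
        if len(set(new.values())) == len(set(colour.values())):
            return True   # partition is stable and histograms agree
        colour = new

def count_triangle_answers(adj):
    """Answers to the full triangle query: ordered triples (a, b, c)
    with edges ab, bc and ca."""
    return sum(1 for a in adj for b in adj[a] for c in adj[b] if a in adj[c])

cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
                 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}

print(wl1_indistinguishable(cycle6, two_triangles))  # True
print(count_triangle_answers(cycle6))                # 0
print(count_triangle_answers(two_triangles))         # 12
```

Both graphs are 2-regular on six vertices, so refinement never splits the single initial colour class, while the triangle counts differ.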

Paper Structure (22 sections, 40 theorems, 52 equations, 4 figures)

Key Result

Theorem 1

Let $(H,X)$ be a connected conjunctive query with $X\neq \emptyset$. Then the WL-dimension of $(H,X)$ is equal to its semantic extension width $\mathsf{sew}(H,X)$.
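
As a worked instance of Theorem 1, consider the $k$-star query mentioned in the TL;DR. The notation $S_k$ and $\mathsf{Ans}$ below is introduced here for illustration and may differ from the paper's; the value $\mathsf{sew}(S_k,X)=k$ is the one reported in the TL;DR, not derived here.

```latex
% k-star query: leaves x_1,...,x_k are free, the centre y is quantified.
\[
  V(S_k)=\{x_1,\dots,x_k,y\},\qquad
  E(S_k)=\bigl\{\{x_i,y\} : 1\le i\le k\bigr\},\qquad
  X=\{x_1,\dots,x_k\}.
\]
% An answer is an assignment of X that extends to a homomorphism S_k -> G,
% i.e. a k-tuple of vertices of G with a common neighbour.
\[
  \bigl|\mathsf{Ans}\bigl((S_k,X),G\bigr)\bigr|
  \;=\;
  \Bigl|\bigl\{(v_1,\dots,v_k)\in V(G)^k :
     \textstyle\bigcap_{i=1}^{k} N_G(v_i)\neq\emptyset\bigr\}\Bigr|.
\]
% (S_k,X) is connected and X is non-empty, so Theorem 1 applies:
\[
  \text{WL-dimension of } (S_k,X) \;=\; \mathsf{sew}(S_k,X) \;=\; k,
  \qquad\text{although}\quad \mathsf{tw}(S_k)=1 .
\]
```

Since $S_k$ is a tree for every $k$, this family shows how acyclic queries can reach arbitrarily large WL-dimension.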

Figures (4)

  • Figure 1: Each homomorphism $h$ from $H$ to $G$ induces a homomorphism $\tau$ from $H$ to $F$ by composing $h$ with the $F$-colouring $c$ of $G$. By partitioning $\mathrm{Hom}(H,G)$ along the induced homomorphisms to $F$, we obtain Observation \ref{obs:partition}.
  • Figure 2: Illustration for the proof of Lemma \ref{lem:cloning_property}: $G'=\mathcal{G}(G,F,c,\Vec{v}\,\Vec{z})$ is the graph obtained from $G$ by cloning vertices (Definition \ref{def:cloning_operation}), and $\rho$ is the homomorphism from $G'$ to $G$ that maps each cloned vertex in $G'$ to its primal vertex in $G$. Moreover, $c$ is the $F$-colouring of $G$ and $c'=\mathcal{C}(G,F,c,\Vec{v},vz)$ is, by Definition \ref{def:cloning_operation}, the composition of $c$ and $\rho$, i.e., each cloned vertex is mapped by $c'$ to the colour of its primal vertex.
  • Figure 3: Illustrations of the mappings used in Definition \ref{def:partition_cq}.
  • Figure 4: Overview of the mappings and homomorphisms used in the proof of Lemma \ref{lem:cloning_property_projections}.

Theorems & Definitions (92)

  • Theorem 1
  • Corollary 2
  • Proposition 3
  • Corollary 3
  • Corollary 3
  • Corollary 4
  • Definition 5
  • Definition 6
  • Definition 7: Counting Equivalence and Counting Minimality
  • Definition 8
  • ...and 82 more