Is uniform expressivity too restrictive? Towards efficient expressivity of graph neural networks
Sammy Khalife, Josué Tonelli-Cueto
TL;DR
This paper investigates the expressivity of Graph Neural Networks (GNNs) with respect to GC2 queries, contrasting uniform expressivity (network size independent of graph size) with non-uniform regimes. It proves that uniform expressivity is impossible for a wide class of Pfaffian activations, including Sigmoid and Tanh, by constructing GC2 queries that cannot be uniformly expressed. It then shows that a form of almost-uniform expressivity is achievable with step-like activations, yielding network sizes that grow only logarithmically, or even doubly logarithmically, with the maximal degree $\Delta$ of the input graphs ($O(d\,\log\Delta)$ or $O(d\,\log\log\Delta)$). The proofs are constructive and exploit the convergence of step-like activations to a linear threshold function; numerical experiments support the theoretical estimates. Overall, the work clarifies the trade-off between uniform guarantees and practical expressivity, offering provable bounds and empirical support for efficient non-uniform GNN expressivity on large graphs.
Abstract
Uniform expressivity guarantees that a Graph Neural Network (GNN) can express a query without the parameters depending on the size of the input graphs. This property is desirable in applications because the number of trainable parameters is then independent of the size of the input graphs. Uniform expressivity of the two-variable guarded fragment (GC2) of first-order logic is a celebrated result for Rectified Linear Unit (ReLU) GNNs [Barceló et al., 2020]. In this article, we prove that uniform expressivity of GC2 queries is not possible for GNNs with a wide class of Pfaffian activation functions (including the sigmoid and tanh), answering a question formulated by [Grohe, 2021]. We also show that, despite these limitations, many of those GNNs can still efficiently express GC2 queries in such a way that the number of parameters remains logarithmic in the maximal degree of the input graphs. Furthermore, we demonstrate that a log-log dependency on the degree is achievable for a certain choice of activation function. This shows that uniform expressivity can be successfully relaxed while still covering the large graphs that appear in practical applications. Our experiments illustrate that our theoretical estimates hold in practice.
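To make the notion of uniform expressivity concrete, here is a minimal sketch of a single sum-aggregation layer with a clipped ReLU that evaluates a toy GC2-style query, "this node has at least k red neighbours". The query, the function names, and the adjacency-list encoding are illustrative assumptions and are not taken from the paper. The point is only that the same fixed parameters work for graphs of any size.

```python
def relu(x):
    return max(0.0, x)


def clipped_relu(x):
    # min(max(x, 0), 1), written as a difference of two ReLUs
    return relu(x) - relu(x - 1.0)


def at_least_k_red_neighbours(adj, is_red, k):
    """Toy one-layer GNN (hypothetical helper, for illustration only).

    adj    : adjacency list, adj[v] is the list of neighbours of node v
    is_red : list of 0/1 node features
    k      : counting threshold of the query (a fixed parameter)

    Returns 1.0 for nodes with at least k red neighbours, else 0.0.
    """
    out = []
    for neighbours in adj:
        # Sum aggregation over the neighbourhood
        red_count = sum(is_red[u] for u in neighbours)
        # Bias -(k - 1) followed by clipped ReLU gives an exact 0/1 output
        out.append(clipped_relu(red_count - (k - 1)))
    return out


def star(n):
    # Star graph: node 0 is the centre, nodes 1..n are leaves
    return [list(range(1, n + 1))] + [[0] for _ in range(n)]


# Same parameters (k = 2) on a small and on a large graph
small = at_least_k_red_neighbours(star(3), [0, 1, 1, 1], k=2)
large = at_least_k_red_neighbours(star(1000), [0] + [1] * 1000, k=2)
```

In this sketch the only parameter is the bias determined by k, which does not change with the number of nodes or with the degree of the centre. The abstract's negative result says that this kind of size-independent construction is not available for all GC2 queries once the activation is, for example, a sigmoid or tanh.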
