On the Hölder Stability of Multiset and Graph Neural Networks

Yair Davidson; Nadav Dym

TL;DR

This paper introduces a probabilistic framework, Hölder in expectation, to quantify pairwise separation quality in permutation-invariant models for multisets and graphs, addressing limitations of traditional separation guarantees. It develops four multiset embeddings (ReLU-sum, adaptive ReLU, smooth-sum, and sort-based) and analyzes their expected Hölder stability, yielding exponents such as $\alpha=\frac{p+1}{p}$ for ReLU-based sums and $\alpha=n$ for smooth activations, with sort-based embeddings achieving bi-Lipschitz stability in expectation. Extending to MPNNs, the authors derive upper-Lipschitz in expectation guarantees for all four architectures while showing that only SortMPNN can attain robust lower-Hölder stability at width 1; adversarial $\epsilon$-tree constructions illustrate depth-driven degradation for ReluMPNN and SmoothMPNN, whereas SortMPNN remains stable. Empirically, SortMPNN and AdaptMPNN demonstrate strong performance on adversarial and standard graph benchmarks (including TU datasets, LRGB, and Zinc12K), often outperforming traditional MPNNs and providing resilience to smaller model budgets. The work highlights the importance of separation quality under finite precision and offers practical, theoretically grounded MPNN designs with improved stability and robustness.

Abstract

Extensive research efforts have been put into characterizing and constructing maximally separating multiset and graph neural networks. However, recent empirical evidence suggests the notion of separation itself doesn't capture several interesting phenomena. On the one hand, the quality of this separation may be very weak, to the extent that the embeddings of "separable" objects might even be considered identical when using fixed finite precision. On the other hand, architectures which aren't capable of separation in theory somehow achieve separation when taking the network to be wide enough. In this work, we address both of these issues by proposing a novel pairwise separation quality analysis framework, which is based on an adaptation of Lipschitz and Hölder stability to parametric functions. The proposed framework, which we name Hölder in expectation, allows for separation quality analysis without restricting the analysis to embeddings that can separate all the input space simultaneously. We prove that common sum-based models are lower-Hölder in expectation, with an exponent that decays rapidly with the network's depth. Our analysis leads to adversarial examples of graphs which can be separated by three 1-WL iterations, but cannot be separated in practice by standard maximally powerful Message Passing Neural Networks (MPNNs). To remedy this, we propose two novel MPNNs with improved separation quality, one of which is lower Lipschitz in expectation. We show these MPNNs can easily classify our adversarial examples, and compare favorably with standard MPNNs on standard graph learning tasks.
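To make the claim about sum-based models concrete, here is a minimal numerical sketch, not the authors' code. It assumes that the "$\pm \epsilon$ example" refers to the one-dimensional multiset pair $\{-\epsilon, \epsilon\}$ versus $\{0, 0\}$, and that the ReLU-sum embedding has the form $\sum_x \mathrm{ReLU}(a x - b)$ with $a$ uniform on the sphere (here $\{-1, +1\}$) and $b$ uniform on $[-B, B]$. The function names `relu_sum`, `expected_gap`, `sort_gap`, and `estimate_exponent` are hypothetical. Under these assumptions, the embedding gap is nonzero only when $|b| < \epsilon$, so its $p$-th moment scales like $\epsilon^{p+1}$, while the gap between the sorted multisets scales linearly in $\epsilon$.

```python
import math
import random


def relu_sum(multiset, a, b):
    """Scalar ReLU-sum embedding (assumed form): sum over x of ReLU(a*x - b)."""
    return sum(max(a * x - b, 0.0) for x in multiset)


def expected_gap(X, Y, p=2, B=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of E[|m(X) - m(Y)|^p]^(1/p) over random (a, b)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        a = rng.choice((-1.0, 1.0))  # uniform on the 0-dimensional sphere {-1, +1}
        b = rng.uniform(-B, B)       # bias drawn uniformly from [-B, B]
        total += abs(relu_sum(X, a, b) - relu_sum(Y, a, b)) ** p
    return (total / samples) ** (1.0 / p)


def sort_gap(X, Y):
    """l2 distance between sorted multisets of equal size.

    For 1-D multisets this equals the Wasserstein-2 distance up to normalization.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sorted(X), sorted(Y))))


def estimate_exponent(eps_hi=0.1, eps_lo=0.01, p=2):
    """Log-log slope of the expected ReLU-sum gap between two values of epsilon."""
    g_hi = expected_gap([-eps_hi, eps_hi], [0.0, 0.0], p=p)
    g_lo = expected_gap([-eps_lo, eps_lo], [0.0, 0.0], p=p)
    return math.log(g_hi / g_lo) / math.log(eps_hi / eps_lo)


if __name__ == "__main__":
    # The slope should land near (p + 1) / p, up to Monte Carlo noise.
    print("estimated ReLU-sum exponent:", estimate_exponent())
    # The sorted distance shrinks by the same factor as epsilon (exponent 1).
    print("sort gap ratio:",
          sort_gap([-0.1, 0.1], [0.0, 0.0]) / sort_gap([-0.01, 0.01], [0.0, 0.0]))
```

In this toy setting the estimated slope can be compared against the exponent $\frac{p+1}{p}$ quoted in the TL;DR for ReLU-based sums. The sorted distance has slope 1, which is consistent with the bi-Lipschitz behavior attributed to sort-based embeddings.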

Paper Structure (67 sections, 25 theorems, 137 equations, 8 figures, 7 tables)

Introduction
Main results: multisets
Main Results: MPNNs
Notation
General framework
Hölder stability for parametric functions
Multiset Hölder stability
ReLU summation
Proof idea: the $\pm \epsilon$ example.
Adaptive ReLU
Summation with smooth activation
Sort based
Summary
MPNN Hölder Stability
MPNNs and WL
...and 52 more sections

Key Result

Theorem 3.1

For $(n,d,p,\Omega,z)$ satisfying our standing assumptions, assume that $a\sim S^{d-1}$ and $b \sim[-B,B]$. Then $m_{ReLU}(\cdot;a,b)$ is uniformly Lipschitz. Moreover,

Figures (8)

Figure 1: Separation quality on a single adversarial multiset-pair constructed as described in appendix \ref{app:adversarial_multiset_experiment}.
Figure 2: $l_2$ vs. $W_2$ distance on multiple adversarial multiset-pairs. Results are in accordance with our theoretical results (see Table \ref{tab:expnoent_bounds}).
Figure 3: $l_2$ vs. TMD on $\epsilon$-Trees. The targeted ReluMPNN and SmoothMPNN exponents deteriorate with depth, in accordance with our theory (see Table \ref{tab:expnoent_bounds}).
Figure 4: Distortion of the 2-tuple embeddings as a function of input and embedding dimension. LC=linear combination, LTSum=linear transform and sum, CP=concat project, C=concatenation.
Figure 5: A pair of labeled graphs $G_1$ (top) and $G_2$ (bottom) used to prove that MPNNs with Hölder building blocks aren't necessarily Hölder.
...and 3 more figures

Theorems & Definitions (45)

Definition 2.1
Theorem 3.1
Theorem 3.2
Theorem 3.3
Theorem 3.4
Theorem 4.1
Theorem 4.2
Theorem 4.3
Lemma B.1
proof
...and 35 more
