Cooperative Graph Neural Networks

Ben Finkelshtein, Xingyue Huang, Michael Bronstein, İsmail İlkan Ceylan

TL;DR

Co-GNNs restructure graph neural computation as a multi-agent, dynamic message-passing process where each node chooses per-layer actions (listen, broadcast, isolate, or standard) via a dedicated action network $\pi$, while an environment network $\eta$ updates states based on these choices. This yields a learned, layer-dependent computational graph that can be directed and asynchronous, enabling flexible propagation of task-relevant information and mitigating issues like over-smoothing and long-range bottlenecks. The framework is shown to be more expressive than 1-WL in expectation due to stochastic action sampling, and theoretical results demonstrate how Co-GNNs can approximate complex functions of distant node features. Empirically, Co-GNNs improve over baselines on synthetic tasks and, notably, achieve state-of-the-art performance on heterophilic graphs, while providing insights into how actions and environment choices shape information flow and topological rewiring. Overall, Co-GNNs offer a principled approach to adaptively shape computation graphs for improved long-range reasoning and robustness across graph types, with potential extensions to directed and multi-relational graphs.
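As a rough illustration of the per-layer process described above, the following sketch samples one action per node and then applies a single state update. It is not the authors' implementation: the hand-picked probability vectors stand in for the output of the action network, the averaging rule stands in for the environment network, and the names `sample_actions` and `environment_step` are hypothetical. The sketch also assumes that a node receives a message only when it listens and the sending neighbor broadcasts.

```python
import random

# Action codes: S = standard (listen and broadcast), L = listen only,
# B = broadcast only, I = isolate.
ACTIONS = ("S", "L", "B", "I")


def sample_actions(action_probs, rng):
    """Draw one action per node from its categorical distribution.

    `action_probs` maps node -> weights over ACTIONS. In a real model these
    would come from a learned action network; here they are hand-picked.
    """
    return {
        node: rng.choices(ACTIONS, weights=probs, k=1)[0]
        for node, probs in action_probs.items()
    }


def environment_step(states, neighbors, actions):
    """One hypothetical update: listening nodes average in what they hear."""
    new_states = {}
    for v, h_v in states.items():
        heard = []
        if actions[v] in ("S", "L"):  # v listens
            heard = [
                states[u] for u in neighbors[v]
                if actions[u] in ("S", "B")  # u broadcasts
            ]
        if heard:
            # Assumed stand-in for the environment network: a simple average.
            new_states[v] = 0.5 * h_v + 0.5 * sum(heard) / len(heard)
        else:
            # Nothing heard, so the state is carried over unchanged.
            new_states[v] = h_v
    return new_states


# Toy path graph a - b - c with scalar node states.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
states = {"a": 0.0, "b": 2.0, "c": 4.0}

# One-hot weights make the draw deterministic for this example.
action_probs = {
    "a": (0, 1, 0, 0),  # always listen
    "b": (0, 0, 1, 0),  # always broadcast
    "c": (0, 0, 0, 1),  # always isolate
}

rng = random.Random(0)
actions = sample_actions(action_probs, rng)
new_states = environment_step(states, neighbors, actions)
print(actions)
print(new_states)
```

Replacing the one-hot weights with softer distributions makes the sampled computation graph vary from layer to layer and from run to run, which is the source of the stochasticity mentioned in the summary.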

Abstract

Graph neural networks are popular architectures for graph machine learning, based on iterative computation of node representations of an input graph through a series of invariant transformations. A large class of graph neural networks follows a standard message-passing paradigm: at every layer, each node state is updated based on an aggregate of messages from its neighborhood. In this work, we propose a novel framework for training graph neural networks, where every node is viewed as a player that can choose to either 'listen', 'broadcast', 'listen and broadcast', or to 'isolate'. The standard message propagation scheme can then be viewed as a special case of this framework where every node 'listens and broadcasts' to all neighbors. Our approach offers a more flexible and dynamic message-passing paradigm, where each node can determine its own strategy based on its state, effectively exploring the graph topology while learning. We provide a theoretical analysis of the new message-passing scheme which is further supported by an extensive empirical analysis on a synthetic dataset and on real-world datasets.
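The special-case claim in the abstract can be checked on a toy graph. The sketch below assumes that a directed edge from `u` to `v` survives only when `u` broadcasts and `v` listens; the helper names (`directed_edges`, `kept_edges`, `kept_ratio`) and the triangle graph are hypothetical and chosen for illustration only.

```python
# Action codes: S = listen and broadcast, L = listen, B = broadcast, I = isolate.
LISTENS = {"S", "L"}
BROADCASTS = {"S", "B"}


def directed_edges(undirected_edges):
    """Expand each undirected edge {u, v} into the pairs (u, v) and (v, u)."""
    out = []
    for u, v in undirected_edges:
        out.append((u, v))
        out.append((v, u))
    return out


def kept_edges(undirected_edges, actions):
    """Keep (u, v) only if u broadcasts and v listens (assumed rule)."""
    return [
        (u, v)
        for u, v in directed_edges(undirected_edges)
        if actions[u] in BROADCASTS and actions[v] in LISTENS
    ]


def kept_ratio(undirected_edges, actions):
    """Fraction of directed edges that survive the chosen actions."""
    total = 2 * len(undirected_edges)
    return len(kept_edges(undirected_edges, actions)) / total if total else 0.0


# Hypothetical triangle graph on the nodes u, v, w.
edges = [("u", "v"), ("v", "w"), ("u", "w")]

all_standard = {"u": "S", "v": "S", "w": "S"}
mixed = {"u": "L", "v": "B", "w": "I"}

print(kept_edges(edges, all_standard))  # every directed edge is kept
print(kept_edges(edges, mixed))         # only the edge from v to u is kept
print(kept_ratio(edges, mixed))
```

When every node picks the standard action, the filter keeps every directed edge, so the computation graph coincides with the input graph. Any other assignment yields a directed subgraph of it.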

Paper Structure

This paper contains 33 sections, 5 theorems, 23 equations, 14 figures, and 14 tables.

Key Result

Proposition 5.1

Let $G_1=(V_1,E_1,\mathbf{X}_1)$ and $G_2=(V_2,E_2,\mathbf{X}_2)$ be two non-isomorphic graphs. Then, for any threshold $0 < \delta < 1$, there exists a parametrization of a Co-GNN architecture using sufficiently many layers $L$, satisfying $\mathbb{P}(\mathbf{z}_{G_1}^{(L)} \ldots$

Figures (14)

  • Figure 1: Example information flow for nodes $u,v$. Top: information flow relative to $u$ across three layers. Node $u$ listens to every neighbor in the first layer, but only to $v$ in the second layer, and only to $s$ and $r$ in the last layer. Bottom: information flow relative to $v$ across three layers. The node $v$ listens only to $w$ in the first two layers, and only to $u$ in the last layer.
  • Figure 2: The input graph $H$ and its computation graphs $H^{(0)}$, $H^{(1)}$, $H^{(2)}$ that result from applying the actions $\langle\textsc{L},\textsc{L},\textsc{S}\rangle$ for the node $u$; $\langle\textsc{S},\textsc{S},\textsc{L}\rangle$ for the nodes $v$ and $w$; $\langle\textsc{S},\textsc{I},\textsc{S}\rangle$ for the nodes $s$ and $r$; and $\langle\textsc{S},\textsc{S},\textsc{S}\rangle$ for all other nodes.
  • Figure 3: RootNeighbors examples. Left: Example tree for RootNeighbors. Right: Example of an optimal directed subgraph over the input tree, where the nodes with a degree of 6 ($u$ and $v$) Broadcast, while other nodes Listen.
  • Figure 4: The ratio of directed edges that are kept on cora (as a homophilic dataset) and on roman-empire (as a heterophilic dataset) for each layer $0 \leq \ell <10$.
  • Figure 5: The 10-hop neighborhood at layer $\ell=4$
  • ...and 9 more figures

Theorems & Definitions (8)

  • Proposition 5.1
  • Proposition 5.2
  • Lemma 1.1
  • Proof of Lemma 1.1
  • Proposition 5.1 (restated)
  • Proof of Proposition 5.1
  • Proposition 5.2 (restated)
  • Proof of Proposition 5.2