Covariant Compositional Networks For Learning Graphs

Risi Kondor, Hy Truong Son, Horace Pan, Brandon Anderson, Shubhendu Trivedi

TL;DR

This paper addresses the limited expressive power of permutation-invariant message passing in graph neural networks by introducing Covariant Compositional Networks (CCNs), a framework of covariant, part-based architectures (comp-nets) that transform under permutation groups via tensor representations. CCNs generalize CNN-like compositionality to graphs, enabling activations to covary according to representations of the symmetric group, and provide general tensor-aggregation rules that preserve covariance across hierarchical receptive fields. The authors formulate first-, second-, and higher-order covariant nodes, derive promotion, stacking, contraction, and adjacency-aware aggregation schemes, and show that standard MPNNs are special cases within this covariant framework. Empirical results on large-scale molecular datasets and graph-kernel benchmarks demonstrate competitive or superior performance for CCNs, illustrating the practical value of imposing steerable, permutation-covariant representations in graph learning.

Abstract

Most existing neural networks for learning graphs address permutation invariance by conceiving of the network as a message passing scheme, where each node sums the feature vectors coming from its neighbors. We argue that this imposes a limitation on their representation power, and instead propose a new general architecture for representing objects consisting of a hierarchy of parts, which we call Covariant Compositional Networks (CCNs). Here, covariance means that the activation of each neuron must transform in a specific way under permutations, similarly to steerability in CNNs. We achieve covariance by making each activation transform according to a tensor representation of the permutation group, and derive the corresponding tensor aggregation rules that each neuron must implement. Experiments show that CCNs can outperform competing methods on standard graph learning benchmarks.
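As a rough illustration of what "transforming according to a tensor representation of the permutation group" can mean, the sketch below treats a first-order activation as a vector indexed by atoms and a second-order activation as a matrix indexed by pairs of atoms. The helper names (`permute_first_order`, `permute_second_order`, `contract_rows`) and the toy values are assumptions made for this example, not taken from the paper. It checks that contracting one index commutes with relabeling the atoms, and that contracting every index gives an invariant scalar.

```python
def permute_first_order(f, sigma):
    """First-order activation: entry i moves to position sigma[i]."""
    out = [None] * len(f)
    for i, value in enumerate(f):
        out[sigma[i]] = value
    return out


def permute_second_order(F, sigma):
    """Second-order activation: entry (i, j) moves to (sigma[i], sigma[j])."""
    n = len(F)
    out = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[sigma[i]][sigma[j]] = F[i][j]
    return out


def contract_rows(F):
    """Sum out the second index: a second-order tensor becomes first-order."""
    return [sum(row) for row in F]


# A toy second-order activation indexed by three atoms (values are arbitrary).
F = [[0, 1, 2],
     [1, 0, 5],
     [2, 5, 0]]
sigma = [2, 0, 1]  # atom 0 -> 2, atom 1 -> 0, atom 2 -> 1

# Contracting then permuting equals permuting then contracting (covariance).
lhs = permute_first_order(contract_rows(F), sigma)
rhs = contract_rows(permute_second_order(F, sigma))
assert lhs == rhs

# Contracting every index yields a permutation-invariant scalar.
assert sum(lhs) == sum(contract_rows(F))
```

In this toy setting the contraction is a plain row sum; the aggregation rules in the paper are more general, so this should be read only as the simplest instance of the idea.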

Paper Structure

This paper contains 21 sections, 7 theorems, 33 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Proposition 1

Let $\phi(\mathcal{G})$ be the output of a comp-net based on a composition scheme $\mathcal{M}$. Assume Then the overall representation $\phi(\mathcal{G})$ is invariant to permutations of the atoms. In particular, if $\mathcal{G}$ is a graph and the atoms are its vertices, then $\phi$ is a permutation invariant graph representation.
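The invariance claim can be checked numerically on a small example. The sketch below is not the paper's construction: it assumes a plain sum-aggregation message passing scheme in which every vertex combines its own feature with those of its neighbors in an order-insensitive way, and the root sums over all vertices. The function names `relabel` and `phi`, the integer nonlinearity, and the toy graph are all hypothetical choices made for this example.

```python
from itertools import permutations


def relabel(adj, labels, sigma):
    """Renumber vertex i as sigma[i] in both the adjacency matrix and labels."""
    n = len(adj)
    new_adj = [[0] * n for _ in range(n)]
    new_labels = [0] * n
    for i in range(n):
        new_labels[sigma[i]] = labels[i]
        for j in range(n):
            new_adj[sigma[i]][sigma[j]] = adj[i][j]
    return new_adj, new_labels


def phi(adj, labels, rounds=2):
    """Sum-aggregation message passing, then a sum over all vertices."""
    n = len(adj)
    feats = list(labels)
    for _ in range(rounds):
        # Order-insensitive aggregation: own feature plus neighbours' features,
        # followed by a simple integer nonlinearity (x -> x*x + 1).
        feats = [
            (feats[v] + sum(feats[w] for w in range(n) if adj[v][w])) ** 2 + 1
            for v in range(n)
        ]
    return sum(feats)


# Toy graph on four vertices with arbitrary integer vertex labels.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 1],
       [0, 1, 0, 1],
       [0, 1, 1, 0]]
labels = [1, 2, 3, 4]

# phi should give the same value for every renumbering of the vertices.
reference = phi(adj, labels)
all_equal = all(
    phi(*relabel(adj, labels, sigma)) == reference
    for sigma in permutations(range(4))
)
assert all_equal
```

Integer arithmetic is used so that the equality check is exact; with floating-point features one would compare up to a tolerance instead.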

Figures (6)

  • Figure 1: (a) A small graph $G$ with 6 vertices and its adjacency matrix. (b) An alternative form $G'$ of the same graph, derived from $G$ by renumbering the vertices by a permutation $\sigma\colon\{1,2,\ldots,6\}\mapsto \{1,2,\ldots,6\}$. The adjacency matrices of $G$ and $G'$ are different, but topologically they represent the same graph. Therefore, we expect the feature map $\phi$ to satisfy $\phi(G)=\phi(G')$.
  • Figure 2: (a) A composition scheme for an object $\mathcal{G}$ is a DAG in which the leaves correspond to atoms, the internal nodes correspond to sets of atoms, and the root corresponds to the entire object. (b) A compositional network is a composition scheme in which each node $\mathfrak{n}_i$ also carries a feature vector $f_i$. The feature vector at $\mathfrak{n}_i$ is computed from the feature vectors of the children of $\mathfrak{n}_i$.
  • Figure 3: A minimal requirement for composition schemes is that they be invariant to permutation, i.e. that if the numbering of the atoms is changed by a permutation $\sigma$, then we must get an isomorphic DAG. Any node in the new DAG that corresponds to $\{e'_{i_1},\ldots,e'_{i_k}\}$ must have a corresponding node in the old DAG corresponding to $\{e_{\sigma^{-1}(i_1)},\ldots,e_{\sigma^{-1}(i_k)}\}$.
  • Figure 4: In convolutional neural networks, if the input image is translated by some amount $(t_1,t_2)$, what used to fall in the receptive field of neuron $\mathfrak{n}^\ell_{i,j}$ is moved to the receptive field of $\mathfrak{n}^\ell_{i+t_1,j+t_2}$. Therefore, the activations transform in the very simple way ${f'}{}^\ell_{i+t_1,j+t_2}=f^\ell_{i,j}$. In contrast, rotations not only move the receptive fields around, but also permute the neurons in the receptive field internally; therefore, in general, ${f'}{}^\ell_{j,-i}\neq f^\ell_{i,j}$. The right-hand figure shows that if the CNN has a horizontal filter (blue) and a vertical one (red), then their activations are exchanged by a 90 degree rotation. In steerable CNNs, if $(i,j)\mapsto (i',j')$, then ${f'}{}^\ell_{i',j'}=R(f^\ell_{i,j})$ for some fixed linear function of the rotation.
  • Figure 5: Top left: At level $\ell=1$, $\mathfrak{n}_3$ aggregates information from $\{\mathfrak{n}_4,\mathfrak{n}_5\}$ and $\mathfrak{n}_2$ aggregates information from $\{\mathfrak{n}_5,\mathfrak{n}_6\}$. At $\ell=2$, $\mathfrak{n}_1$ collects this summary information from $\mathfrak{n}_3$ and $\mathfrak{n}_2$. Bottom left: This graph is not isomorphic to the top one, but the activations of $\mathfrak{n}_3$ and $\mathfrak{n}_2$ at $\ell=1$ will be identical. Therefore, at $\ell=2$, $\mathfrak{n}_1$ will get the same inputs from its neighbors, irrespective of whether $\mathfrak{n}_5$ and $\mathfrak{n}_7$ are the same node. Right: Aggregation at different levels. To keep the figure legible, only the neighborhood around one node in higher levels is marked.
  • ...and 1 more figure
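The situation described in the Figure 5 caption can be reproduced with a very small sketch. The child lists below are one reading of the caption, simplified so that each node aggregates only from the listed children; the dictionaries `top` and `bottom` and the functions `sum_activation` and `receptive_field` are hypothetical names introduced here. With identical leaf features, plain summation gives $\mathfrak{n}_1$ the same activation in both graphs, whereas keeping track of which atoms lie in the receptive field tells them apart.

```python
# Hypothetical child lists mirroring the Figure 5 caption (a simplification:
# each node aggregates only from the children listed for it).
top = {"n1": ["n2", "n3"], "n3": ["n4", "n5"], "n2": ["n5", "n6"]}
bottom = {"n1": ["n2", "n3"], "n3": ["n4", "n5"], "n2": ["n7", "n6"]}


def sum_activation(children, node, leaf_value=1):
    """Sum-based aggregation: a node's activation is the sum of its children's."""
    kids = children.get(node)
    if not kids:
        return leaf_value  # leaves all carry the same initial feature
    return sum(sum_activation(children, kid, leaf_value) for kid in kids)


def receptive_field(children, node):
    """Set of leaf atoms that a node ultimately aggregates from."""
    kids = children.get(node)
    if not kids:
        return {node}
    return set().union(*(receptive_field(children, kid) for kid in kids))


# Plain summation cannot tell the two graphs apart at n1 ...
assert sum_activation(top, "n1") == sum_activation(bottom, "n1")

# ... but the receptive fields differ, because n5 is shared only in `top`.
top_field = receptive_field(top, "n1")
bottom_field = receptive_field(bottom, "n1")
assert top_field != bottom_field
```

Tracking atom identities in the receptive field is only a stand-in here for the richer, index-aware activations the paper proposes; it is meant to show what information summation discards, not how CCNs recover it.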

Theorems & Definitions (16)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Proposition 1
  • Proposition 2
  • Definition 5
  • Definition 6
  • Proposition 3
  • Definition 7
  • ...and 6 more