Learning Convolutional Neural Networks for Graphs

Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov

TL;DR

The paper extends convolutional neural networks to arbitrary graphs by introducing Patchy-san, a framework that builds fixed-size, normalized local neighborhoods and uses them as receptive fields for a CNN. It relies on graph labeling (e.g., Weisfeiler-Lehman) and canonicalization to align these patches across graphs, which enables end-to-end learning with node and edge attributes. In experiments, Patchy-san reaches accuracy competitive with state-of-the-art graph kernels while remaining efficient to compute, and its learned features can be visualized. The approach broadens the applicability of CNNs to graph-structured data, with potential for large-scale and multi-attribute graphs.

Abstract

Numerous important problems can be framed as learning from graph data. We propose a framework for learning convolutional neural networks for arbitrary graphs. These graphs may be undirected or directed, and may have both discrete and continuous node and edge attributes. Analogous to image-based convolutional networks that operate on locally connected regions of the input, we present a general approach to extracting locally connected regions from graphs. Using established benchmark data sets, we demonstrate that the learned feature representations are competitive with state-of-the-art graph kernels and that their computation is highly efficient.

Paper Structure

This paper contains 16 sections, 4 theorems, 1 equation, 5 figures, 2 tables, 4 algorithms.

Key Result

Theorem 1

Optimal graph normalization is NP-hard.

Figures (5)

  • Figure 1: A CNN with a receptive field of size $3\times 3$. The field is moved over an image from left to right and top to bottom using a particular stride (here: 1) and zero-padding (here: none) (a). The values read by the receptive fields are transformed into a linear layer and fed to a convolutional architecture (b). The node sequence for which the receptive fields are created and the shapes of the receptive fields are fully determined by the hyper-parameters.
  • Figure 2: An illustration of the proposed architecture. A node sequence is selected from a graph via a graph labeling procedure. For some nodes in the sequence, a local neighborhood graph is assembled and normalized. The normalized neighborhoods are used as receptive fields and combined with existing CNN components.
  • Figure 3: The normalization is performed for each of the graphs induced on the neighborhood of a root node $v$ (the red node; node colors indicate distance to the root node). A graph labeling is used to rank the nodes and to create the normalized receptive fields, one of size $k$ (here: $k=9$) for node attributes and one of size $k\times k$ for edge attributes. Normalization also includes cropping of excess nodes and padding with dummy nodes. Each vertex (edge) attribute corresponds to an input channel with the respective receptive field.
  • Figure 4: Receptive fields per second rates on different graphs.
  • Figure 5: Visualization of RBM features learned with 1-dimensional WL normalized receptive fields of size $9$ for a torus (periodic lattice, top left), a preferential attachment graph (Barabási and Albert, 1999; bottom left), a co-purchasing network of political books (top right), and a random graph (bottom right). Instances of these graphs with about $100$ nodes are depicted on the left. Each panel also shows a visual representation of the feature's weights (the darker a pixel, the stronger the corresponding weight) and $3$ graphs sampled from the RBMs by setting all but the hidden node corresponding to the feature to zero. Yellow nodes have position $1$ in the adjacency matrices. (Best seen in color.)
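
The normalization described in the caption of Figure 3 (assemble a neighborhood around a root node, rank its nodes, crop the excess, pad with dummy nodes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary-based graph representation, and the `dummy` value are hypothetical, and the ranking by (distance to root, degree, node id) is an assumed stand-in for a proper graph labeling such as Weisfeiler-Lehman.

```python
def assemble_neighborhood(adj, root, k):
    """Collect nodes around `root` one breadth-first level at a time until
    at least k nodes are gathered or no new nodes can be reached.
    Returns a dict mapping each collected node to its distance from root."""
    dist = {root: 0}
    level = [root]
    while level and len(dist) < k:
        next_level = []
        for u in level:
            for v in sorted(adj[u]):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    next_level.append(v)
        level = next_level
    return dist


def normalize_receptive_field(adj, attrs, root, k, dummy=0):
    """Build a size-k node field and a k x k edge field for `root`.

    Assumption: nodes are ranked by (distance to root, descending degree,
    node id). This is a simple stand-in for a graph labeling procedure."""
    dist = assemble_neighborhood(adj, root, k)
    ranked = sorted(dist, key=lambda v: (dist[v], -len(adj[v]), v))
    ranked = ranked[:k]  # crop excess nodes

    # Node-attribute field of length k, padded with dummy entries.
    node_field = [attrs[v] for v in ranked] + [dummy] * (k - len(ranked))

    # Edge field: adjacency among ranked nodes; rows and columns that
    # belong to dummy nodes stay all zero.
    edge_field = [[0] * k for _ in range(k)]
    for i, u in enumerate(ranked):
        for j, v in enumerate(ranked):
            if v in adj[u]:
                edge_field[i][j] = 1
    return ranked, node_field, edge_field


# Toy undirected graph: edges 0-1, 0-2, 1-3, with one attribute per node.
adj = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
attrs = {0: 10, 1: 11, 2: 12, 3: 13}

order, node_field, edge_field = normalize_receptive_field(adj, attrs, root=3, k=5)
print(order)       # ranked nodes, root first
print(node_field)  # attributes in rank order, padded to length 5
```

In this toy graph the neighborhood of node 3 has only four nodes, so with `k=5` the last slot of the node field and the last row and column of the edge field are padding. With a smaller `k`, the lowest-ranked nodes are cropped instead.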

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4