Exact Subgraph Isomorphism Network with Mixed $L_{0,2}$ Norm Constraint for Predictive Graph Mining

Taiga Kojima, Haruto Kajita, Ayato Kohara, Masayuki Karasuyama

TL;DR

EIN tackles graph-level prediction by integrating exact subgraph enumeration with a neural network, using a mixed $L_{0,2}$ sparsity constraint to select a small, interpretable set of predictive subgraphs. The Graph Mining Layer learns subgraph representations through a linear aggregation over candidate subgraphs, while an iterative hard-thresholding optimization and a pruning scheme based on gradient upper bounds keep the enumeration tractable. The paper provides convergence guarantees under standard smoothness and Kurdyka-Łojasiewicz (KL) assumptions and reports competitive accuracy on synthetic and real-world datasets, with post-hoc interpretability analyses based on SHAP, decision trees, and RF. Combining exact subgraph information with neural models also allows flexible integration with Graph Neural Networks, improving discriminative power without sacrificing interpretability or tractability.

Abstract

In the graph-level prediction task (predicting a label for a given graph), the information contained in subgraphs of the input graph plays a key role. In this paper, we propose the Exact subgraph Isomorphism Network (EIN), which combines exact subgraph enumeration, a neural network, and sparse regularization via a mixed $L_{0,2}$ norm constraint. In general, building a graph-level prediction model that achieves high discriminative ability along with interpretability is still a challenging problem. Our combination of subgraph enumeration and a neural network contributes to high discriminative ability with respect to the subgraph structure of the input graph. Further, the sparse regularization in EIN enables us 1) to derive an effective pruning strategy that mitigates the computational difficulty of the enumeration while maintaining prediction performance, and 2) to identify important subgraphs, which contributes to high interpretability. We empirically show that EIN has sufficiently high prediction performance compared with standard graph neural network models, and we also show examples of post-hoc analysis based on the selected subgraphs.
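
To make the role of the mixed $L_{0,2}$ norm constraint more concrete, the following is a minimal sketch of one iterative hard-thresholding step, not the authors' implementation. It assumes that the parameters are arranged as a matrix `B` with one row per candidate subgraph, and that the constraint allows at most `k` rows with a nonzero $L_2$ norm. The names `project_l02`, `iht_step`, `step_size`, and `k` are hypothetical and chosen only for illustration.

```python
import math


def row_norm(row):
    """L2 norm of one row (one candidate subgraph's parameters, by assumption)."""
    return math.sqrt(sum(v * v for v in row))


def project_l02(B, k):
    """Keep the k rows of B with the largest L2 norm and zero out the rest.

    Under the row-per-subgraph assumption, this is the Euclidean projection
    onto the set of matrices with at most k nonzero rows.
    """
    order = sorted(range(len(B)), key=lambda i: row_norm(B[i]), reverse=True)
    keep = set(order[:max(k, 0)])
    return [list(row) if i in keep else [0.0] * len(row)
            for i, row in enumerate(B)]


def iht_step(B, grad, step_size, k):
    """One gradient step followed by group hard thresholding."""
    stepped = [[b - step_size * g for b, g in zip(b_row, g_row)]
               for b_row, g_row in zip(B, grad)]
    return project_l02(stepped, k)


# Toy example: four hypothetical subgraphs, two parameters each.
B = [[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, 2.0]]
grad = [[0.0, 0.0] for _ in B]  # zero gradient, so only the projection acts
B_next = iht_step(B, grad, step_size=0.1, k=2)
selected = [i for i, row in enumerate(B_next) if row_norm(row) > 0.0]
```

In this toy example the two rows with the largest norms survive, and `selected` lists the indices of the subgraphs that remain active. Rows that are zeroed out correspond to subgraphs excluded from the model, which is what makes the selected set small enough for post-hoc inspection.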

Paper Structure

This paper contains 33 sections, 9 theorems, 107 equations, 7 figures, 5 tables, 2 algorithms.

Key Result

Theorem 2.1

Let $H^\prime \sqsupseteq H$ and $\delta^t_{ik} = {\frac{\partial \mathrm{loss}(y_i, f(G_i) ; \boldsymbol{B}^t, \boldsymbol{b}^t, \boldsymbol{\theta}^t)}{\partial h_k}}$. Then,

Figures (7)

  • Figure 1: Overview of proposed method.
  • Figure 2: A simple example of EIN combined with GNN.
  • Figure 3: Subgraphs (a) and (b) have closed paths whose lengths are 8 and 9, respectively, which are difficult to discriminate. Subgraph (c) is used for adjusting the number of nodes that have label '1'. Note that all subgraphs (a), (b), and (c) have 16 nodes.
  • Figure 4: Example of SHAP applied to an EIN prediction from the ToxCast dataset.
  • Figure 5: Decision tree on Cycle_XOR.
  • ...and 2 more figures

Theorems & Definitions (19)

  • Theorem 2.1
  • Corollary 2.1
  • Remark 2.1
  • Remark 2.2
  • Definition 2.1
  • Lemma 2.1
  • Theorem 2.2
  • Theorem 2.3
  • proof
  • Lemma G.1
  • ...and 9 more