
Learning Attributed Graphlets: Predictive Graph Mining by Graphlets with Trainable Attribute

Tajima Shinji, Ren Sugihara, Ryota Kitahara, Masayuki Karasuyama

TL;DR

This work introduces LAGRA, an interpretable graph classification framework that jointly learns a sparse set of attributed graphlets (AGs) and their node attributes. Each AG contributes to the prediction through an AG inclusion score $\psi(G;H)$ that combines structural containment and attribute-driven similarity, yielding a linear, interpretable decision function: $f(G)=\beta_0+\sum_H\beta_H\psi(G;H)$. A novel optimization strategy blends proximal gradient descent with graph mining-based pruning to handle a combinatorially large AG space, preserving the quality of the unpruned solution while dramatically reducing computation. Empirical results on standard graph benchmarks show LAGRA achieving competitive accuracy with only a small number of AGs, and qualitative analyses illustrate the interpretability of the learned graphlets and their attributes. Overall, LAGRA advances interpretable predictive graph mining by exhaustively exploring subgraph structures up to a limit while keeping optimization tractable through principled pruning.
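
The linear form of $f(G)$ can be illustrated with a minimal Python sketch. It assumes the inclusion scores $\psi(G;H)$ have already been computed and are passed in as a dictionary; the function names, the AG identifiers, and the sign-based decision rule for labels in $\{-1,+1\}$ are illustrative assumptions, not the authors' code.

```python
from typing import Dict


def predict_score(beta0: float, beta: Dict[str, float], psi: Dict[str, float]) -> float:
    """Evaluate f(G) = beta0 + sum_H beta_H * psi(G; H).

    `beta` maps a (hypothetical) AG identifier to its weight, and `psi`
    maps the same identifiers to precomputed inclusion scores for one graph G.
    """
    # AGs with zero weight drop out of the sum, which is why a sparse
    # weight vector gives a model that is easy to inspect.
    return beta0 + sum(w * psi.get(h, 0.0) for h, w in beta.items() if w != 0.0)


def predict_label(score: float) -> int:
    """Assumed decision rule for labels in {-1, +1}: the sign of f(G)."""
    return 1 if score >= 0.0 else -1


# Toy values (made up for illustration only).
beta = {"H_plus": 2.0, "H_minus": -1.0, "H_unused": 0.0}
psi = {"H_plus": 0.5, "H_minus": 1.0, "H_unused": 0.9}
score = predict_score(0.5, beta, psi)
label = predict_label(score)
```

Here `H_unused` has a large inclusion score but zero weight, so it has no effect on the prediction; only the AGs with nonzero $\beta_H$ need to be shown to a user.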

Abstract

The graph classification problem has been widely studied; however, achieving an interpretable model with high predictive performance remains a challenging issue. This paper proposes an interpretable classification algorithm for attributed graph data, called LAGRA (Learning Attributed GRAphlets). LAGRA learns importance weights for small attributed subgraphs, called attributed graphlets (AGs), while simultaneously optimizing their attribute vectors. This enables us to obtain a combination of subgraph structures and their attribute vectors that strongly contribute to discriminating different classes. A significant characteristic of LAGRA is that all the subgraph structures in the training dataset can be considered as candidate structures of AGs. This approach can explore all the potentially important subgraphs exhaustively, but a naive implementation can require a large amount of computation. To mitigate this issue, we propose an efficient pruning strategy that combines proximal gradient descent with a graph mining tree search. Our pruning strategy ensures that the quality of the solution is maintained compared to the result without pruning. We empirically demonstrate that LAGRA has superior or comparable prediction performance to standard existing algorithms, including graph neural networks, while using only a small number of AGs in an interpretable manner.
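
To make the role of proximal gradient descent concrete, the following Python sketch shows one generic proximal step. It assumes the sparsity comes from an L1 penalty with strength `lam`, which this summary does not state explicitly; the function names and parameters are hypothetical and this is not the authors' implementation.

```python
from typing import List


def soft_threshold(x: float, t: float) -> float:
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def proximal_step(beta: List[float], grad: List[float], step: float, lam: float) -> List[float]:
    """One proximal gradient update under an assumed L1 penalty lam * ||beta||_1."""
    # Gradient step on the smooth loss, followed by soft-thresholding.
    return [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]


# A weight that is currently zero stays zero whenever |gradient| <= lam.
stays_zero = proximal_step([0.0], [0.75], step=0.5, lam=1.0)
```

The last line hints at why pruning can be safe in this kind of scheme: if the gradient magnitude of every AG in a search subtree can be bounded by the threshold, those zero weights cannot change in the update, so the subtree need not be expanded.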

Paper Structure

This paper contains 21 sections, 1 theorem, 28 equations, 11 figures, 3 tables, 2 algorithms.

Key Result

Theorem 2.1

Let $L(H^\prime) \sqsupseteq L(H)$ and $H, H^\prime \in \overline{{\mathcal{W}}}$. Then, [formulas missing from the source], where ${\mathcal{I}} = \{i \mid 1 - y_i(\boldsymbol{\psi}_i^\top \boldsymbol{\beta} + \beta_0) > 0\}$.
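
The index set ${\mathcal{I}}$ collects the training instances whose margin $y_i(\boldsymbol{\psi}_i^\top \boldsymbol{\beta} + \beta_0)$ is below 1, as in a hinge-type loss. A minimal Python sketch of that definition follows; the function name and the toy numbers are illustrative assumptions.

```python
from typing import List, Sequence


def active_set(psi_rows: Sequence[Sequence[float]], y: Sequence[int],
               beta: Sequence[float], beta0: float) -> List[int]:
    """Return I = {i | 1 - y_i * (psi_i^T beta + beta0) > 0}.

    psi_rows[i] holds the inclusion scores of graph i for each AG,
    in the same order as the weights in `beta`.
    """
    indices = []
    for i, (row, yi) in enumerate(zip(psi_rows, y)):
        margin = yi * (sum(p * b for p, b in zip(row, beta)) + beta0)
        if 1.0 - margin > 0.0:
            indices.append(i)
    return indices


# Toy data (made up): three graphs, two AGs.
psi_rows = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
y = [1, -1, 1]
I = active_set(psi_rows, y, beta=[2.0, -2.0], beta0=0.0)
```

In this toy case the first two graphs have margin 2 and fall outside ${\mathcal{I}}$, while the third has margin 0 and is the only member.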

Figures (11)

  • Figure 1: Illustration of our attributed graphlet (AG) based prediction model. The color of each graph node represents its node label, and the bar plot associated with each graph node represents a trainable attribute vector.
  • Figure 2: An example of important attributed graphlets $H^+$ and $H^-$ identified by LAGRA in the AIDS dataset. $H^+$ and $H^-$ contribute positively and negatively to the prediction, respectively. The right plot is a scatter plot whose $x$- and $y$-axes are our graphlet features, representing how precisely $H^+$ and $H^-$ are included in the input graph $G_i$. Each point is from the test dataset.
  • Figure 3: Examples of matchings between a graph and AGs (colors of graph nodes are node labels). (a) For two AGs $H$ and $H^\prime$, $L(G_i)$ only contains $L(H)$, and $L(H^\prime)$ is not contained. Then, $\psi(G_i ; H) > 0$ and $\psi(G_i ; H^\prime) = 0$. (b) An example of the set of injections $M = \{ m, m^\prime \}$, where $m(1) = 2, m(2) = 3, m^\prime(1) = 1$, and $m^\prime(2) = 4$. The figure shows that $m$ and $m^\prime$ are label and edge preserving.
  • Figure 4: An example of training data, ${\mathcal{L}}$ and ${\mathcal{H}}$. Since ${\mathcal{L}}$ only includes subgraphs in the training data, [inline subgraph image] is not included in ${\mathcal{L}}$. ${\mathcal{H}}$ is created from ${\mathcal{L}}$ by adding trainable attribute vectors $\boldsymbol{z}^{H_i}_{v}$ ($v \in V_{H_i}$).
  • Figure 5: An illustration of LAGRA. (a) In the forward pass, only passes with $|\beta_H| > 0$ contribute to the output. AGIS is defined by the best matching between an input graph and an AG. (b) In the backward pass, the gradient computation can be pruned when the pruning rule is satisfied. In this illustration, $H^{\prime\prime}$ is pruned, so graphs expanded from $H^{\prime\prime}$ are not required to compute the gradient.
  • ...and 6 more figures
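
The label- and edge-preserving injections described in the caption of Figure 3(b) can be enumerated by brute force for small graphlets. The Python sketch below assumes undirected graphs; the toy node labels and edges are hypothetical and were chosen only so that the result reproduces the caption's $M = \{m, m^\prime\}$. It is not the graph mining search used in the paper.

```python
from itertools import permutations
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[int, int]


def preserving_injections(h_labels: Dict[int, Hashable], h_edges: List[Edge],
                          g_labels: Dict[int, Hashable], g_edges: List[Edge]) -> List[Dict[int, int]]:
    """Enumerate injections m: V_H -> V_G that preserve node labels and edges."""
    g_edge_set = {frozenset(e) for e in g_edges}  # undirected edges
    h_nodes = sorted(h_labels)
    found = []
    # Every ordered choice of |V_H| distinct nodes of G is a candidate injection.
    for image in permutations(sorted(g_labels), len(h_nodes)):
        m = dict(zip(h_nodes, image))
        if any(h_labels[v] != g_labels[m[v]] for v in h_nodes):
            continue  # a node label is not preserved
        if all(frozenset((m[u], m[v])) in g_edge_set for u, v in h_edges):
            found.append(m)  # every edge of H maps onto an edge of G
    return found


# Hypothetical toy instance: H is a single A-B edge, G has four nodes.
h_labels = {1: "A", 2: "B"}
h_edges = [(1, 2)]
g_labels = {1: "A", 2: "A", 3: "B", 4: "B"}
g_edges = [(1, 2), (2, 3), (1, 4)]
M = preserving_injections(h_labels, h_edges, g_labels, g_edges)
```

With this toy graph, `M` contains the two mappings from the caption: $m$ with $m(1)=2, m(2)=3$ and $m^\prime$ with $m^\prime(1)=1, m^\prime(2)=4$. The enumeration grows factorially with graphlet size, so it is only suitable for illustrating the definition.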

Theorems & Definitions (1)

  • Theorem 2.1