Learning Attributed Graphlets: Predictive Graph Mining by Graphlets with Trainable Attribute
Tajima Shinji, Ren Sugihara, Ryota Kitahara, Masayuki Karasuyama
TL;DR
This work introduces LAGRA, an interpretable graph classification framework that jointly learns a sparse set of attributed graphlets (AGs) and their node attributes. Each AG contributes to the prediction through an AG inclusion score $\psi(G;H)$ that combines structural containment with attribute-driven similarity, giving a linear, interpretable decision function: $f(G)=\beta_0+\sum_H\beta_H\psi(G;H)$. The optimization strategy combines proximal gradient descent with graph mining-based pruning to handle the combinatorially large AG space; the pruning preserves the quality of the solution obtained without pruning while substantially reducing computation. Empirical results on standard graph benchmarks show that LAGRA achieves competitive accuracy with only a small number of AGs, and qualitative analyses illustrate the interpretability of the learned graphlets and their attributes. Overall, LAGRA advances interpretable predictive graph mining by exhaustively exploring subgraph structures up to a limit while keeping optimization tractable through principled pruning.
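The linear decision function above can be illustrated with a minimal sketch. It assumes the AG inclusion scores $\psi(G;H)$ have already been computed for one graph; the numeric values and the names `predict`, `beta`, and `psi` are hypothetical and chosen only for illustration, and the computation of $\psi$ itself is not shown.

```python
import numpy as np

def predict(beta0, beta, psi):
    """Evaluate f(G) = beta0 + sum_H beta_H * psi(G; H) for one graph G."""
    return beta0 + float(np.dot(beta, psi))

# Hypothetical values: three candidate AGs, one of which has zero weight.
beta0 = -0.5
beta = np.array([2.0, 0.0, -1.0])   # importance weights beta_H (sparse)
psi = np.array([0.5, 0.9, 0.25])    # assumed precomputed inclusion scores

score = predict(beta0, beta, psi)

# Per-AG contributions beta_H * psi(G; H); an AG with zero weight
# contributes nothing, whatever its inclusion score.
contributions = beta * psi
```

Because the model is linear in the inclusion scores, each entry of `contributions` can be read directly as the amount by which one AG pushes the score toward one class or the other.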
Abstract
The graph classification problem has been widely studied; however, achieving an interpretable model with high predictive performance remains challenging. This paper proposes an interpretable classification algorithm for attributed graph data, called LAGRA (Learning Attributed GRAphlets). LAGRA learns importance weights for small attributed subgraphs, called attributed graphlets (AGs), while simultaneously optimizing their attribute vectors. This yields a combination of subgraph structures and attribute vectors that strongly contribute to discriminating between classes. A significant characteristic of LAGRA is that all the subgraph structures in the training dataset can be considered as candidate structures of AGs. This approach can explore all the potentially important subgraphs exhaustively, but a naive implementation can require a large amount of computation. To mitigate this issue, we propose an efficient pruning strategy that combines proximal gradient descent with a graph mining tree search. Our pruning strategy ensures that the quality of the solution is maintained compared to the result without pruning. We empirically demonstrate that LAGRA has superior or comparable prediction performance to standard existing algorithms, including graph neural networks, while using only a small number of AGs in an interpretable manner.
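To give a sense of how proximal gradient descent can produce a small set of nonzero importance weights, the following is a minimal sketch under stated assumptions: it uses a squared loss and an L1 penalty, neither of which is specified in the text above, and it treats the inclusion scores as a fixed matrix `Psi` (one row per graph, one column per candidate AG). It does not implement the graph mining tree search, the pruning rule, or the attribute optimization; the function names and data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(Psi, y, lam, step, n_iter=500):
    """Minimize (1 / (2n)) * ||Psi @ beta - y||^2 + lam * ||beta||_1.

    Assumed objective for illustration only, not the paper's formulation.
    """
    n, d = Psi.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        grad = Psi.T @ (Psi @ beta - y) / n                    # gradient of the smooth part
        beta = soft_threshold(beta - step * grad, step * lam)  # proximal step
    return beta

# Toy data: three graphs, three candidate AGs (hypothetical scores and targets).
Psi = np.eye(3)
y = np.array([3.0, 0.1, -2.0])

beta = proximal_gradient(Psi, y, lam=0.5, step=1.0)
```

In this toy case the weight of the second candidate is driven exactly to zero by the thresholding step, which is the mechanism by which a sparse set of AGs would be selected under these assumptions.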
