Graph Information Bottleneck for Subgraph Recognition

Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, Ran He

TL;DR

This paper tackles subgraph recognition by extending the information bottleneck to irregular graph data through Graph Information Bottleneck (GIB). It defines IB-subgraphs that maximize predictive information while minimizing extraneous graph information, using a graph-specific mutual information estimator within a bi-level optimization framework and a connectivity loss to stabilize training. Empirical results demonstrate that IB-subgraphs improve graph classification, provide interpretable substructures, and enhance denoising capabilities across multiple GNN backbones. The approach is model-agnostic and directly targets informative, compact graph substructures without requiring subgraph annotations.

Abstract

Given the input graph and its label/property, several key problems of graph learning, such as finding interpretable subgraphs, graph denoising and graph compression, can be attributed to the fundamental problem of recognizing a subgraph of the original one. This subgraph should be as informative as possible, yet contain less redundant and noisy structure. This problem setting is closely related to the well-known information bottleneck (IB) principle, which, however, has been less studied for irregular graph data and graph neural networks (GNNs). In this paper, we propose a framework of Graph Information Bottleneck (GIB) for the subgraph recognition problem in deep graph learning. Under this framework, one can recognize the maximally informative yet compressive subgraph, named IB-subgraph. However, the GIB objective is notoriously hard to optimize, mostly due to the intractability of the mutual information of irregular graph data and the unstable optimization process. To tackle these challenges, we propose: i) a GIB objective based on a mutual information estimator for irregular graph data; ii) a bi-level optimization scheme to maximize the GIB objective; iii) a connectivity loss to stabilize the optimization process. We evaluate the properties of the IB-subgraph in three application scenarios: improvement of graph classification, graph interpretation and graph denoising. Extensive experiments demonstrate that the information-theoretic IB-subgraph enjoys superior graph properties.

Paper Structure

This paper contains 20 sections, 18 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Illustration of the proposed graph information bottleneck (GIB) framework. We employ a bi-level optimization scheme to optimize the GIB objective, thus yielding the IB-subgraph. In the inner optimization phase, we estimate $I({\mathcal{G}},{\mathcal{G}}_{sub})$ by optimizing the statistics network of the Donsker-Varadhan (DV) representation. Given a good estimate of $I({\mathcal{G}},{\mathcal{G}}_{sub})$, in the outer optimization phase, we maximize the GIB objective by optimizing the mutual information, the classification loss $\mathcal{L}_{cls}$ and the connectivity loss $\mathcal{L}_{con}$.
  • Figure 2: The molecules with their interpretable subgraphs discovered by different methods. These subgraphs exhibit similar chemical properties compared to the molecules on the left.
  • Figure 3: We use the bi-level objective to minimize the mutual information of two distributions. The MI is consistent with the loss as $\mathcal{L}_{MI}$ declines.
  • Figure 4: The histogram of absolute bias between the property of graphs and subgraphs.
  • Figure 5: The molecules with their interpretations found by GIB. These subgraphs exhibit similar chemical properties compared to the molecules on the left.
  • ...and 1 more figure
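
The inner phase described in the Figure 1 caption relies on the Donsker-Varadhan (DV) representation, which lower-bounds mutual information by the mean of a statistics function T over joint samples minus the log of the mean of exp(T) over samples from the product of marginals. The sketch below is only an illustration of that bound, not the paper's implementation: it uses a toy pair of correlated Gaussian scalars instead of graphs, and it plugs in the closed-form optimal statistics function instead of training a statistics network. The names `dv_lower_bound` and `gaussian_log_ratio` are hypothetical.

```python
import math
import random


def dv_lower_bound(t_joint, t_marginal):
    """DV bound: mean of T on joint samples minus log-mean-exp of T on marginal samples."""
    m = max(t_marginal)  # subtract the max for a numerically stable log-mean-exp
    log_mean_exp = m + math.log(
        sum(math.exp(t - m) for t in t_marginal) / len(t_marginal)
    )
    return sum(t_joint) / len(t_joint) - log_mean_exp


def gaussian_log_ratio(x, z, rho):
    """Assumed stand-in for a trained statistics network: the exact log density
    ratio log p(x, z) / (p(x) p(z)) for standard Gaussians with correlation rho."""
    s = 1.0 - rho * rho
    quad = rho * rho * x * x - 2.0 * rho * x * z + rho * rho * z * z
    return -0.5 * math.log(s) - quad / (2.0 * s)


rng = random.Random(0)
rho, n = 0.8, 20000

# Joint samples (x, z) with correlation rho.
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
zs = [rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for x in xs]

# Shuffling z breaks the pairing, approximating samples from p(x) p(z).
zs_shuffled = zs[:]
rng.shuffle(zs_shuffled)

estimate = dv_lower_bound(
    [gaussian_log_ratio(x, z, rho) for x, z in zip(xs, zs)],
    [gaussian_log_ratio(x, z, rho) for x, z in zip(xs, zs_shuffled)],
)
true_mi = -0.5 * math.log(1.0 - rho * rho)  # closed-form MI for this toy case

# A constant statistics function carries no information, so the bound is 0.
trivial_bound = dv_lower_bound([0.0] * n, [0.0] * n)

print(f"DV estimate: {estimate:.3f}, closed-form MI: {true_mi:.3f}")
```

In a setting like the one the caption describes, the closed-form critic would be replaced by a learned network over graph and subgraph representations, and the bound would be maximized with respect to that network's parameters before each outer step.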

Theorems & Definitions (2)

  • Definition 4.1: Graph Information Bottleneck
  • Definition 4.2: IB-subgraph
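
For orientation, the two quantities these definitions and the Figure 1 caption revolve around can be sketched as follows. This assumes the standard Lagrangian form of the information bottleneck applied to a graph $\mathcal{G}$, a candidate subgraph $\mathcal{G}_{sub}$ and a label $Y$; the symbols $Y$, $\beta$ and $T$ are not given in this summary and the paper's own Definition 4.1 may use different notation.

$$
\max_{\mathcal{G}_{sub}} \; I(Y, \mathcal{G}_{sub}) - \beta \, I(\mathcal{G}, \mathcal{G}_{sub})
$$

$$
I(\mathcal{G}, \mathcal{G}_{sub}) = \sup_{T} \; \mathbb{E}_{p(\mathcal{G}, \mathcal{G}_{sub})}\left[T\right] - \log \mathbb{E}_{p(\mathcal{G})\, p(\mathcal{G}_{sub})}\left[e^{T}\right]
$$

The first line trades predictive information against compression, with $\beta > 0$ as an assumed trade-off weight. The second is the Donsker-Varadhan representation, where $T$ ranges over statistics functions; restricting $T$ to a parameterized network yields a lower bound that can be optimized by gradient methods.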