Transfer Entropy in Graph Convolutional Neural Networks

Adrian Moldovan, Angel Caţaron, Răzvan Andonie

TL;DR

Problem: GCNs suffer from oversmoothing during deep message passing and can underperform on graphs with mixed homophily and heterophily. Approach: embed a Transfer Entropy (TE) based control mechanism (TE-GGCN) that selects high-heterophily, high-degree nodes and adjusts their features after convolution blocks using TE values. Contributions: shows that TE-GGCN can improve accuracy over GGCN across diverse datasets, at the cost of the computational overhead that TE introduces. Significance: offers a practical, plug-in method to boost GCN performance on heterogeneous graphs with minimal architectural changes.

Abstract

Graph Convolutional Networks (GCNs) are Graph Neural Networks in which convolutions are applied over a graph. In contrast to Convolutional Neural Networks, GCNs are designed to perform inference on graphs, where the number of nodes can vary and the nodes are unordered. In this study, we address two important challenges related to GCNs: i) oversmoothing; and ii) the utilization of node relational properties (i.e., heterophily and homophily). Oversmoothing is the degradation of the discriminative capacity of nodes as a result of repeated aggregations. Heterophily is the tendency of nodes of different classes to connect, whereas homophily is the tendency of similar nodes to connect. We propose a new strategy for addressing these challenges in GCNs based on Transfer Entropy (TE), which measures the amount of directed transfer of information between two time-varying nodes. Our findings indicate that using node heterophily and degree information as a node selection mechanism, along with feature-based TE calculations, enhances accuracy across various GCN models. Our model can be easily modified to improve the classification accuracy of a GCN model. As a trade-off, this performance boost comes with a significant computational overhead when the TE is computed for many graph nodes.
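
To make the notion of "directed transfer of information" concrete, the following is a minimal sketch of the textbook plug-in estimator of Transfer Entropy between two discrete time series, with a history length of one step. It is not the authors' implementation: the function name `transfer_entropy`, the choice of history length, and the assumption that the series are already discretized are all illustrative. With a source series x and a target series y, the quantity estimated is the sum over observed triples of p(y_{t+1}, y_t, x_t) · log2[ p(y_{t+1} | y_t, x_t) / p(y_{t+1} | y_t) ].

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Assumes both series are discrete (already binned) and equally long.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    n = len(target) - 1  # number of observed transitions
    if n < 1:
        raise ValueError("need at least two time steps")

    triples = Counter()   # counts of (y_next, y_now, x_now)
    pairs_yx = Counter()  # counts of (y_now, x_now)
    pairs_yy = Counter()  # counts of (y_next, y_now)
    singles = Counter()   # counts of y_now

    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pairs_yx[(y_now, x_now)] += 1
        pairs_yy[(y_next, y_now)] += 1
        singles[y_now] += 1

    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        # p(y_next | y_now, x_now): prediction using both histories
        p_full = count / pairs_yx[(y_now, x_now)]
        # p(y_next | y_now): prediction using the target's own history only
        p_self = pairs_yy[(y_next, y_now)] / singles[y_now]
        te += p_joint * log2(p_full / p_self)
    return te


# Hypothetical example: the target copies the source with a one-step delay.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
y = [0] + x[:-1]
te_x_to_y = transfer_entropy(x, y)          # positive: x helps predict y
te_const = transfer_entropy([1] * 20, y)    # a constant source carries no information
```

The estimate is asymmetric by construction, which is what distinguishes it from mutual information. The nested counting also hints at the overhead mentioned above: the cost grows with the number of node pairs for which TE is evaluated.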

Paper Structure (11 sections, 7 equations, 2 figures, 1 table)

This paper contains 11 sections, 7 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Illustration of a node's features being aggregated through a two-layer convolution in a GCN (graphic from yoon2022). Node A in the second layer receives aggregated values from all neighbors up to a depth equal to the number of convolutional layers.
  • Figure 2: Graph representations of selected datasets with different types of structure. Clusters were obtained using the Louvain method blondel2008; colors do not represent the actual classes in the dataset. First, the Louvain algorithm converts the graph data into a k-nearest neighbor graph. Then, the degree (number of edges) within a cluster is compared with the degree to the exterior of the cluster, yielding weights for the edges between clusters. Cora is a highly homophilic graph, Chameleon has both homophilic and heterophilic subgraphs, and Wisconsin is heterophilic. Cora contains multiple disconnected subgraphs, whereas in Chameleon all subgraphs show low to high connectivity between the various classes.
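
The two ideas behind these figures, multi-hop aggregation and per-node heterophily, can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the paper's code: `propagate` applies only the standard symmetric-normalized aggregation step of a GCN layer (no learned weights, no nonlinearity), and `node_heterophily` uses one common definition, the fraction of a node's neighbors with a different label, which may differ from the exact measure used by the authors. All function names and the toy path graph are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops added."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate(adj, features, num_layers):
    """Repeat the aggregation step num_layers times (weights omitted)."""
    a_norm = normalized_adjacency(adj)
    h = features
    for _ in range(num_layers):
        h = a_norm @ h
    return h


def node_heterophily(adj, labels):
    """Fraction of each node's neighbors whose label differs from its own."""
    labels = np.asarray(labels)
    differs = labels[:, None] != labels[None, :]
    degree = adj.sum(axis=1)
    diff_count = (adj * differs).sum(axis=1)
    out = np.zeros_like(degree, dtype=float)
    # Isolated nodes (degree 0) keep a heterophily of 0 by convention here.
    return np.divide(diff_count, degree, out=out, where=degree > 0)


# Toy path graph 0 - 1 - 2 - 3 with two classes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
labels = [0, 0, 1, 1]

# One-hot features make row i of the result show which nodes reach node i.
reach = propagate(adj, np.eye(4), num_layers=2)
heterophily = node_heterophily(adj, labels)
degree = adj.sum(axis=1)
```

With two layers, node 0 receives a contribution from node 2 (two hops away) but none from node 3 (three hops away), matching the receptive-field picture in Figure 1. The `heterophily` and `degree` arrays are the kind of per-node quantities a selection rule could threshold on; any particular threshold would be a further assumption.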