Geometric Pooling: maintaining more useful information

Hao Xu; Jia Liu; Yang Shen; Kenan Lou; Yanxia Bao; Ruihua Zhang; Shuyue Zhou; Hongsen Zhao; Shuai Wang

TL;DR

Geometric Pooling (GP) targets information loss in traditional global graph pooling by selecting node features based on inter-node similarity rather than magnitude, preserving informative negative-valued features. Framed as an entropy-reduction regularization, GP can be integrated with a DGCNN-like backbone (GP-mixed combines sorting with GP for further gains) and achieves state-of-the-art or near-state-of-the-art accuracy on TU graph datasets with fewer parameters. The method leverages Euclidean-distance-based similarity across multi-layer node representations to keep diverse, representative nodes, and demonstrates improved generalization through distribution drag. Overall, GP provides a scalable, information-preserving alternative to sorting-based pooling with demonstrated practical impact on graph classification tasks.

Abstract

Graph pooling plays an important role in graph node classification tasks. Sorting-based pooling methods keep large-value units when pooling graphs of varying sizes. However, by analyzing the statistical characteristics of the activated units after pooling, we found that a large number of the units dropped by sorting pooling are negative-valued units that contain useful information and can contribute considerably to the final decision. To retain more useful information, we propose a novel pooling technique, Geometric Pooling (GP), which keeps unique node features, including those with negative values, by measuring the similarity of all node features. We explain the effectiveness of GP from the entropy-reduction view. Experiments on the TU datasets show that GP outperforms the SOTA graph pooling techniques by 1% to 5% with fewer parameters.
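The contrast between the two selection rules can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sort_pool` and `distance_pool` are hypothetical, and scoring a node by its mean Euclidean distance to all other nodes is an assumption about how "lowest similarity" might be measured.

```python
import math

def sort_pool(features, k):
    """Sorting-style pooling: keep the k nodes with the largest last channel."""
    order = sorted(range(len(features)),
                   key=lambda i: features[i][-1], reverse=True)
    return order[:k]

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_pool(features, k):
    """Distance-based pooling (assumed scoring rule): keep the k nodes
    with the largest mean Euclidean distance to every other node,
    i.e. the nodes least similar to the rest of the graph."""
    n = len(features)

    def mean_dist(i):
        total = sum(euclidean(features[i], features[j])
                    for j in range(n) if j != i)
        return total / max(n - 1, 1)

    order = sorted(range(n), key=mean_dist, reverse=True)
    return order[:k]

# Toy graph: three near-duplicate positive nodes, one strongly negative
# node (index 3), and one near-zero node (index 4).
toy = [[0.9, 1.0], [0.8, 0.9], [0.85, 0.95], [-1.0, -1.2], [0.1, 0.0]]
kept_by_sorting = sort_pool(toy, 3)
kept_by_distance = distance_pool(toy, 3)
```

On this toy input, the sorting rule keeps only the three near-duplicate positive nodes, while the distance rule keeps the negative-valued node because it is far from everything else.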

Paper Structure (15 sections, 8 equations, 4 figures, 5 tables, 1 algorithm)

Introduction
Related Works
Methods
Set Up
Analysis of the Number Distribution
Geometric Pooling
Entropy Reduction: distribution drag
Experiments
Set Up
Comparison with deep approaches
Comparison with kernel-based approaches
Comparison on the computation cost
Limits and Conclusion
Activation Function Replacement
Ablation Study

Figures (4)

Figure 1: The main difference between sorting pooling and the proposed GP. Graphs with different numbers of nodes are marked in different colors. The number in each node is its amplitude-sorting index; for example, the amplitude of the node indexed 0 is larger than that of the node indexed 1. The color contrast among nodes in a particular graph indicates the similarity among them. If three nodes are retained, sorting pooling keeps the nodes with indices 0, 1, and 2, while GP retains the three nodes with the lowest similarity.
Figure 2: Count histograms for sorting pooling and GP. In (a), units are dropped by sorting pooling; in (b), units are dropped according to the geometric distance between node features.
Figure 3: The effectiveness of GP explained from the output-distribution-drag view. Typically, the common cross-entropy loss solely encourages the output distribution to be close to the one-hot ground-truth distribution, which leads to over-fitting. With GP, the output distribution is also encouraged to be close to the uniform distribution, i.e., it is dragged away from the one-hot ground truth. $\hat{y}=v(\mathbf{B}(x))$ represents the output of GNNs, where $\mathbf{B}$ denotes the feature extractor of GNNs and $v$ denotes the final linear layer. $y \sim \mathbf{U}(a, b)$ denotes samples drawn from a uniform distribution over $(a, b)$.
Figure 4: Parameter comparison. We report the parameter counts of different deep approaches on two datasets, D&D and PROTEINS. For example, GP/DGCNNs (D) denotes the parameter count of the proposed GP and DGCNNs on the D&D dataset, and SAGPooling (P) denotes the parameter count of SAGPooling on the PROTEINS dataset. The numbers on the colored bars give the exact parameter counts.
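
The distribution-drag idea in the Figure 3 caption can be made concrete with a small sketch. The paper attributes this effect to GP itself; the explicit penalty term below is only an illustrative stand-in, and the weight `lam`, the KL direction, and the function names are assumptions rather than the authors' formulation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, target):
    """Cross-entropy against a one-hot target given as a class index."""
    return -math.log(probs[target])

def kl_uniform(probs):
    """KL(U || p), where U is uniform over the classes.
    Zero when p is uniform, larger as p becomes more peaked."""
    c = len(probs)
    return sum((1.0 / c) * math.log((1.0 / c) / p) for p in probs)

def dragged_loss(logits, target, lam=0.1):
    """Cross-entropy plus a pull toward the uniform distribution
    (hypothetical weight lam; not taken from the paper)."""
    probs = softmax(logits)
    return cross_entropy(probs, target) + lam * kl_uniform(probs)

# The penalty vanishes for a uniform output and grows with confidence.
flat = kl_uniform(softmax([0.0, 0.0]))
mild = kl_uniform(softmax([1.0, 0.0]))
peaked = kl_uniform(softmax([5.0, 0.0]))
```

The cross-entropy term pulls the output toward the one-hot label, while the second term pulls it toward the uniform distribution, so very confident predictions cost more than they would under cross-entropy alone.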
