HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning
Zhuo Xu, Lu Bai, Lixin Cui, Ming Li, Yue Wang, Edwin R. Hancock
TL;DR
HC-GAE tackles the dual challenges of universal multi-task graph representations and over-smoothing in GAEs by introducing a hierarchical encoder that partitions graphs into separated subgraphs via hard assignments and compresses them to a coarsened graph, coupled with a decoder that reconstructs via soft assignments. The training objective jointly optimizes a local KL-based loss on subgraphs and a global reconstruction loss, yielding bidirectional, hierarchical graph-level and node-level representations. Empirical results on node and graph classification benchmarks show HC-GAE achieving strong performance and robustness, outperforming several baselines and validating the benefit of hierarchical encoding and the mixed assignment strategy. Overall, HC-GAE offers a scalable, self-supervised framework for multi-task graph representation learning with reduced over-smoothing and improved generalization.
Abstract
Graph Auto-Encoders (GAEs) are powerful tools for graph representation learning. In this paper, we develop a novel Hierarchical Cluster-based GAE (HC-GAE) that can learn effective structural characteristics for graph data analysis. To this end, during the encoding process, we commence by utilizing hard node assignment to decompose a sample graph into a family of separated subgraphs. We then compress each subgraph into a coarsened node, transforming the original graph into a coarsened graph. During the decoding process, in contrast, we adopt soft node assignment to reconstruct the original graph structure by expanding the coarsened nodes. By hierarchically performing the compressing procedure during the encoding process and the expanding procedure during the decoding process, the proposed HC-GAE can effectively extract bidirectionally hierarchical structural features of the original sample graph. Furthermore, we re-design the loss function so that it integrates information from both the encoder and the decoder. Since the graph convolution operation of HC-GAE is restricted to each individual separated subgraph and cannot propagate node information between different subgraphs, HC-GAE can significantly reduce the over-smoothing problem arising in classical convolution-based GAEs. HC-GAE can generate effective representations for either node classification or graph classification, and experiments on real-world datasets demonstrate its effectiveness.
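To make the encode/decode idea in the abstract concrete, the following is a minimal NumPy sketch of one compressing step (hard assignment) and one expanding step (soft assignment). It is an illustration under stated assumptions, not the authors' implementation: the function names (`hard_assign`, `encode_step`, `decode_step`), the mean-aggregation propagation rule, the pooling formulas `S.T @ Z` and `S.T @ A @ S`, and the use of raw score matrices in place of learned GNN outputs are all hypothetical choices made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hard_assign(scores):
    """One-hot assignment: each node (row) joins its highest-scoring cluster."""
    S = np.zeros_like(scores, dtype=float)
    S[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1.0
    return S

def encode_step(A, X, scores):
    """One compressing step (assumed form, for illustration only).

    A: (n, n) adjacency, X: (n, d) node features,
    scores: (n, k) cluster scores (in a real model, a learned output).
    """
    n = A.shape[0]
    S = hard_assign(scores)                # (n, k), exactly one 1 per row
    same_cluster = S @ S.T                 # (n, n), 1 where two nodes share a cluster
    A_sub = A * same_cluster               # drop edges between different subgraphs
    # Assumed propagation rule: mean over self plus in-subgraph neighbours,
    # so no information crosses subgraph boundaries.
    A_hat = A_sub + np.eye(n)
    Z = (A_hat / A_hat.sum(axis=1, keepdims=True)) @ X
    X_coarse = S.T @ Z                     # (k, d), one coarsened node per subgraph
    A_coarse = S.T @ A @ S                 # (k, k), connectivity between subgraphs
    return S, A_sub, X_coarse, A_coarse

def decode_step(A_coarse, X_coarse, scores_re):
    """One expanding step with a soft (row-stochastic) assignment.

    scores_re: (n, k) scores mapping each reconstructed node to coarsened nodes.
    """
    S_re = softmax(scores_re, axis=1)      # (n, k), each row sums to 1
    X_re = S_re @ X_coarse                 # (n, d) reconstructed features
    A_re = S_re @ A_coarse @ S_re.T        # (n, n) reconstructed (weighted) adjacency
    return S_re, X_re, A_re
```

In this sketch, the hard assignment zeroes out every edge that crosses a cluster boundary before propagation, which is the mechanism the abstract credits for limiting over-smoothing; the soft assignment in the decoder lets each reconstructed node draw on several coarsened nodes at once.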
