GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction

Shijin Duan, Ruyi Ding, Jiaxing He, Aidong Adam Ding, Yunsi Fei, Xiaolin Xu

TL;DR

This work introduces a cross-correlation mechanism that significantly enhances the representational capabilities of graph autoencoders (GAEs), and proposes GraphCroc, a new GAE that supports flexible encoder architectures tailored to various downstream tasks and ensures robust structural reconstruction through a mirrored encoding-decoding process.

Abstract

Graph-structured data is integral to many applications, prompting the development of various graph representation methods. Graph autoencoders (GAEs), in particular, reconstruct graph structures from node embeddings. Current GAE models primarily use self-correlation to represent graph structures and focus on node-level tasks, often overlooking multi-graph scenarios. Our theoretical analysis indicates that self-correlation generally falls short of accurately representing specific graph features such as islands, symmetrical structures, and directional edges, particularly for small graphs or multi-graph settings. To address these limitations, we introduce a cross-correlation mechanism that significantly enhances the representational capabilities of GAEs. Additionally, we propose GraphCroc, a new GAE that supports flexible encoder architectures tailored to various downstream tasks and ensures robust structural reconstruction through a mirrored encoding-decoding process. The model also tackles representation bias during optimization by implementing a loss-balancing strategy. Both theoretical analysis and numerical evaluations demonstrate that our methodology significantly outperforms existing self-correlation-based GAEs in graph structure reconstruction.
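
The contrast between the two decoding schemes can be illustrated with a small sketch. It assumes the common inner-product decoder followed by a sigmoid, i.e. edge probability $\sigma(z_i^T z_j)$ for self-correlation and $\sigma(p_i^T q_j)$ for cross-correlation; the sigmoid, the function names, and the toy embeddings are assumptions made for illustration and are not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_correlation(Z):
    """Edge probabilities sigma(z_i^T z_j): one embedding per node."""
    return [[sigmoid(dot(zi, zj)) for zj in Z] for zi in Z]

def cross_correlation(P, Q):
    """Edge probabilities sigma(p_i^T q_j): two embeddings per node."""
    return [[sigmoid(dot(pi, qj)) for qj in Q] for pi in P]

# Hypothetical 2-node embeddings, chosen only to expose the difference.
Z = [[1.0, -2.0], [0.5, 3.0]]
A_self = self_correlation(Z)

P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[-2.0, 3.0], [-3.0, -2.0]]
A_cross = cross_correlation(P, Q)

# Self-correlation: z_i^T z_i >= 0, so every diagonal entry is >= 0.5,
# and the matrix is symmetric by construction.
print("self diag :", [round(A_self[i][i], 3) for i in range(2)])
print("self sym  :", A_self[0][1] == A_self[1][0])

# Cross-correlation: the diagonal can fall below 0.5 (no self-loop) and
# entry (1, 0) can differ from entry (0, 1) (a directed edge).
print("cross diag:", [round(A_cross[i][i], 3) for i in range(2)])
print("cross 1->0:", round(A_cross[1][0], 3), " 0->1:", round(A_cross[0][1], 3))
```

Under these assumptions, no choice of `Z` can push a diagonal probability below 0.5 or make the prediction asymmetric, whereas separate `P` and `Q` can do both. This matches the abstract's claim that self-correlation struggles with directional edges and similar structures.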

Paper Structure (39 sections, 14 equations, 18 figures, 7 tables)

Figures (18)

  • Figure 1: Two examples of topologically symmetric graphs. The left graph is axis-symmetric; the right graph is centrosymmetric.
  • Figure 1: The AUC score of reconstructing the adjacency matrix in graph tasks. We reproduce the most representative global GAE methods with different decoding strategies. The self-correlation methods include naïve GAE, variational GAE \cite{kipf2016variational}, L2-norm (EGNN) \cite{satorras2021egnn}, and our GraphCroc under self-correlation; the cross-correlation methods include directed representation (DiGAE) \cite{kollias2022directed} and our GraphCroc. The best results are in bold, and the second-best results are underlined.
  • Figure 2: Training comparison between self-correlation and cross-correlation on a PROTEINS subset (64 graphs). Panels (a) and (b) show the trajectories of the first two node embeddings in the first graph over the training iterations, where the star marks the end point of training. We apply PCA for dimensionality reduction and a Savitzky-Golay filter to smooth the traces for visualization. We also set $z_i=p_i\neq q_i$ at the beginning of optimization so that the traces of $z_i$ in (a) and $p_i$ in (b) start from the same point. (c) shows the BCE loss trace of this graph during training, indicating that cross-correlation can lead the reconstruction to a better solution. (d) shows the distribution of diagonal elements during training, i.e., $z_i^T z_i$ for self-correlation and $p_i^T q_i$ for cross-correlation. The results for other graphs are provided in Appendix \ref{app:trajectory}.
  • Figure 3: GraphCroc architecture. The encoder is depicted generically as an $(L+1)$-layer GNN. The decoder has two paths that generate the node embeddings for cross-correlation; each path mirrors the structure of the encoder. Each decoder layer receives the node features and graph structure information from the corresponding encoder layer. Notably, the GCN module shown on the right incorporates skip connections and normalization to improve performance.
  • Figure 4: WL-test results on different GAE methods, in the IMDB-B task.
  • ...and 13 more figures
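
The BCE loss traced in Figure 2(c) and the loss-balancing strategy mentioned in the abstract can be sketched as follows. This section does not give the exact weighting the authors use, so the scheme below (averaging the edge and non-edge terms separately, then combining them equally) is an assumption, and `plain_bce` and `balanced_bce` are hypothetical helper names.

```python
import math

def plain_bce(A, A_hat, eps=1e-12):
    """Standard BCE averaged over every entry of the adjacency matrix."""
    total, count = 0.0, 0
    for row, row_hat in zip(A, A_hat):
        for a, p in zip(row, row_hat):
            total -= a * math.log(p + eps) + (1 - a) * math.log(1.0 - p + eps)
            count += 1
    return total / count

def balanced_bce(A, A_hat, eps=1e-12):
    """BCE with edge and non-edge terms averaged separately (assumed scheme).

    Each class contributes its own mean, so the many non-edges of a sparse
    graph cannot dominate the loss.
    """
    pos_loss, neg_loss, n_pos, n_neg = 0.0, 0.0, 0, 0
    for row, row_hat in zip(A, A_hat):
        for a, p in zip(row, row_hat):
            if a == 1:
                pos_loss -= math.log(p + eps)
                n_pos += 1
            else:
                neg_loss -= math.log(1.0 - p + eps)
                n_neg += 1
    pos_mean = pos_loss / n_pos if n_pos else 0.0
    neg_mean = neg_loss / n_neg if n_neg else 0.0
    return 0.5 * (pos_mean + neg_mean)

# Hypothetical sparse graph: 4 nodes, one undirected edge between 0 and 1.
A = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

# A degenerate predictor that says "probably no edge" everywhere.
A_hat = [[0.1] * 4 for _ in range(4)]

loss_plain = plain_bce(A, A_hat)
loss_balanced = balanced_bce(A, A_hat)
print("plain BCE   :", round(loss_plain, 3))
print("balanced BCE:", round(loss_balanced, 3))
```

In this toy case the degenerate predictor receives a lower plain BCE than balanced BCE, because the missed edges are averaged in with the far more numerous non-edges. A balanced loss of this kind penalizes the missed edges more heavily, which is one way to counter the bias toward the majority class.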