DeepHGCN: Toward Deeper Hyperbolic Graph Convolutional Networks

Jiaxu Liu, Xinping Yi, Xiaowei Huang

TL;DR

DeepHGCN tackles the depth limitation of hyperbolic graph networks by introducing a scalable hyperbolic feature transformation and a suite of residual and regularization techniques to preserve expressive power across deep stacks. It combines a fast hyperbolic backbone based on the Poincaré ball with Möbius gyromidpoint aggregation and Dirichlet-energy-guided training. Empirical results across diverse datasets show substantial gains in link prediction and node classification over Euclidean GCNs and shallow hyperbolic models, with deeper architectures offering increasing benefits. The work suggests a practical pathway to deep hyperbolic graph learning and points to future directions in mixed-curvature and broader manifold contexts.

Abstract

Hyperbolic graph convolutional networks (HGCNs) have demonstrated significant potential in extracting information from hierarchical graphs. However, existing HGCNs are limited to shallow architectures due to the computational expense of hyperbolic operations and the issue of over-smoothing as depth increases. Although treatments have been applied to alleviate over-smoothing in GCNs, developing a hyperbolic solution presents distinct challenges since operations must be carefully designed to fit the hyperbolic nature. Addressing these challenges, we propose DeepHGCN, the first deep multi-layer HGCN architecture with dramatically improved computational efficiency and substantially reduced over-smoothing. DeepHGCN features two key innovations: (1) a novel hyperbolic feature transformation layer that enables fast and accurate linear mappings, and (2) techniques such as hyperbolic residual connections and regularization for both weights and features, facilitated by an efficient hyperbolic midpoint method. Extensive experiments demonstrate that DeepHGCN achieves significant improvements in link prediction and node classification tasks compared to both Euclidean and shallow hyperbolic GCN variants.
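
Over-smoothing, as described above, means that node features collapse toward one another as layers are stacked. A minimal sketch of how a Dirichlet-style energy can quantify this on the Poincaré ball is shown below. It assumes unit curvature ($\kappa = -1$) and an unnormalized sum of squared geodesic distances over edges; the function names `poincare_distance` and `dirichlet_energy` are hypothetical, and the paper's exact energy definition is not reproduced in this extract and may differ.

```python
import math

def poincare_distance(x, y):
    """Geodesic distance on the unit Poincare ball (curvature -1 assumed)."""
    sq_x = sum(v * v for v in x)
    sq_y = sum(v * v for v in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_x) * (1.0 - sq_y))
    # Guard against rounding that pushes the argument slightly below 1.
    return math.acosh(max(1.0, arg))

def dirichlet_energy(features, edges):
    """Illustrative energy: half the sum of squared distances over edges.

    This unnormalized form is an assumption made for the sketch,
    not the definition used in the paper.
    """
    return 0.5 * sum(
        poincare_distance(features[i], features[j]) ** 2 for i, j in edges
    )

edges = [(0, 1), (1, 2)]
spread = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]]   # distinct node features
collapsed = [[0.1, 0.2]] * 3                    # fully over-smoothed features
print(dirichlet_energy(spread, edges), dirichlet_energy(collapsed, edges))
```

An energy that decays toward zero as depth grows indicates that neighboring nodes have become indistinguishable, which is the failure mode the residual and regularization techniques are meant to prevent.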

Paper Structure

This paper contains 47 sections, 4 theorems, 35 equations, 11 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

Given $\mathbf{h} \in \mathbb{D}^{d_1}_\kappa$, a Euclidean weight ${\mathbf{W}} \in \mathbb{R}^{d_2 \times d_1}$ and bias parameters $\mathbf{b}_1, \mathbf{b}_2\in\mathbb{R}^{d_2}$, a more computationally efficient and expressive feature transformation $\mathcal{F}^\kappa_{\mathbb{D}}: \mathbb{D}^{d_1}_\kappa \to \mathbb{D}^{d_2}_\kappa$ is given by [...], where $\phi(\cdot)$ is formulated as [...]

Figures (11)

  • Figure 1: Left: Two prevalent hyperbolic models, related by an isometric projection through the red line, where $\operatorname{P}_{\mathbb{D}\to\mathbb{L}}$: $\textcolor{red}{\bullet}\to \textcolor{blue}{\bullet}$ and $\operatorname{P}_{\mathbb{L}\to\mathbb{D}}$: $\textcolor{blue}{\bullet}\to \textcolor{red}{\bullet}$; Right: Performance over training time on Airport over 5k epochs. The 2-layer DeepHGCN outperforms existing hyperbolic models and is more efficient. Increasing the depth to 8 layers, DeepHGCN(8), brings further improvements.
  • Figure 2: Decision hyperplanes of various feature transformations on synthetic binary classification tasks (Task #1: (a)-(d); Task #2: (e)-(h)).
  • Figure 3: Comparison of the tangential midpoint [chami2019hyperbolic], the Möbius gyromidpoint [ungar2008gyrovector], and the differentiable Fréchet mean [Lou2020DifferentiatingTT] in the 2-dimensional Poincaré disk. For each method we illustrate the weighted midpoint (blue) of two points and of multiple randomly sampled points (red) with randomly initialized weights.
  • Figure 4: Comparison between the existing HGCN architecture and the proposed DeepHGCN. At the $l$-th layer, (a) applies the linear transformation directly after aggregation and passes the transformed feature on as the next layer's input, causing over-smoothing as $l$ increases; (b) applies a hyperbolic residual connection after the aggregation and linear layer to alleviate over-smoothing. The hyperbolic residual operator keeps the feature on the manifold, so the global hyperbolic geometry is preserved.
  • Figure 5: Averaged performance of different models with various numbers of layers. We include hyperbolic, homophilic, and heterophilic datasets. The deeper models that overcome over-smoothing generally perform better.
  • ...and 6 more figures
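
The isometric projection between the two hyperbolic models mentioned in the Figure 1 caption can be sketched as follows. This is a minimal illustration assuming unit curvature ($\kappa = -1$), where the maps reduce to the standard stereographic projection between the Poincaré ball $\mathbb{D}$ and the Lorentz (hyperboloid) model $\mathbb{L}$; the function names are hypothetical, and the paper's curvature-dependent forms are not shown in this extract.

```python
def poincare_to_lorentz(y):
    """Map a point of the unit Poincare ball to the hyperboloid (kappa = -1 assumed)."""
    sq = sum(v * v for v in y)
    denom = 1.0 - sq
    return [(1.0 + sq) / denom] + [2.0 * v / denom for v in y]

def lorentz_to_poincare(x):
    """Inverse map: project a hyperboloid point back into the unit ball."""
    return [v / (1.0 + x[0]) for v in x[1:]]

def minkowski_inner(a, b):
    """Lorentzian inner product with signature (-, +, ..., +)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

y = [0.3, -0.4]
x = poincare_to_lorentz(y)
print(x, minkowski_inner(x, x), lorentz_to_poincare(x))
```

Every image point satisfies $\langle x, x\rangle_{\mathbb{L}} = -1$, and composing the two maps returns the original point, which is what makes it possible to move features between the models without distorting distances.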

Theorems & Definitions (8)

  • Theorem 1
  • Proposition 2
  • Proposition 3
  • proof
  • Definition 1: Möbius gyromidpoint [ungar2008gyrovector]
  • Definition 2
  • Proposition 4
  • proof
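
The Möbius gyromidpoint named in Definition 1 can be sketched in a few lines. The version below assumes the unit Poincaré ball ($\kappa = -1$) and positive weights, and follows the standard gyrovector-space form $m = \tfrac{1}{2} \otimes \frac{\sum_i \alpha_i \lambda_i x_i}{\sum_i \alpha_i (\lambda_i - 1)}$ with conformal factor $\lambda_i = 2 / (1 - \lVert x_i \rVert^2)$. The function name `mobius_gyromidpoint` is hypothetical, and the paper's curvature-dependent statement of the definition is not reproduced in this extract.

```python
import math

def mobius_gyromidpoint(points, weights):
    """Weighted Mobius gyromidpoint on the unit Poincare ball (kappa = -1 assumed)."""
    dim = len(points[0])
    num = [0.0] * dim
    den = 0.0
    for x, a in zip(points, weights):
        lam = 2.0 / (1.0 - sum(v * v for v in x))  # conformal factor at x
        for k in range(dim):
            num[k] += a * lam * x[k]
        den += a * (lam - 1.0)
    v = [c / den for c in num]
    # Mobius scalar multiplication by 1/2: v / (1 + sqrt(1 - |v|^2)).
    sq = sum(c * c for c in v)
    scale = 1.0 / (1.0 + math.sqrt(max(0.0, 1.0 - sq)))
    return [scale * c for c in v]

m = mobius_gyromidpoint([[0.0, 0.0], [0.5, 0.0]], [1.0, 1.0])
print(m)
```

For two equally weighted points the result coincides with the geodesic midpoint, and the whole computation is a closed-form weighted sum, in contrast to an iterative solver for the Fréchet mean.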