Improving Hyperbolic Representations via Gromov-Wasserstein Regularization

Yifei Yang, Wonjun Lee, Dongmian Zou, Gilad Lerman

TL;DR

This work tackles the challenge of preserving the intrinsic geometric structure when embedding data into hyperbolic space. It introduces a GW-based regularization framework, leveraging the Gromov-Monge formulation to align pairwise costs between Euclidean inputs and hyperbolic embeddings via a transport map induced by hyperbolic neural networks. Theoretical guarantees connect empirical GM costs to population costs under a bi-Lipschitz assumption, and the approach maintains favorable computational properties. Empirically, the GW regularization yields consistent improvements on few-shot image classification and semi-supervised graph tasks while incurring only modest overhead, highlighting its practical value for geometry-aware hyperbolic learning.
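For orientation, a Gromov-Monge-type cost for a map $T$ and its empirical counterpart on $m$ samples can be written as below. This is a sketch of the standard form, not a quotation of the paper's definition: the squared exponent and the $1/m^2$ normalization are assumptions, and the symbols $c_{\mathbb{X}}$, $c_{\mathbb{H}}$, $T$, $\mu$ follow the theorem statement in this summary.

```latex
\mathrm{GM}(T)
  = \iint \bigl| c_{\mathbb{X}}(\mathbf{x}, \mathbf{x}')
      - c_{\mathbb{H}}\bigl(T(\mathbf{x}), T(\mathbf{x}')\bigr) \bigr|^{2}
    \, d\mu(\mathbf{x}) \, d\mu(\mathbf{x}'),
\qquad
\widehat{\mathrm{GM}}_m(T)
  = \frac{1}{m^{2}} \sum_{i=1}^{m} \sum_{j=1}^{m}
    \bigl| c_{\mathbb{X}}(\mathbf{x}_i, \mathbf{x}_j)
      - c_{\mathbb{H}}\bigl(T(\mathbf{x}_i), T(\mathbf{x}_j)\bigr) \bigr|^{2}.
```

Because the target measure is the pushforward $\nu = T_\# \mu$ by construction, no optimization over couplings is needed; the cost is evaluated directly for the given map.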

Abstract

Hyperbolic representations have shown remarkable efficacy in modeling inherent hierarchies and complexities within data structures. Hyperbolic neural networks have been commonly applied for learning such representations from data, but they often fall short in preserving the geometric structures of the original feature spaces. In response to this challenge, our work applies the Gromov-Wasserstein (GW) distance as a novel regularization mechanism within hyperbolic neural networks. The GW distance quantifies how well the original data structure is maintained after embedding the data in a hyperbolic space. Specifically, we explicitly treat the layers of the hyperbolic neural networks as a transport map and calculate the GW distance accordingly. We validate that the GW distance computed based on a training set well approximates the GW distance of the underlying data distribution. Our approach demonstrates consistent enhancements over current state-of-the-art methods across various tasks, including few-shot image classification, as well as semi-supervised graph link prediction and node classification.
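The idea of treating the network as a transport map and comparing pairwise costs before and after embedding can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Euclidean input cost, a Poincaré ball of curvature -1 as the hyperbolic model, a squared-difference penalty, and a toy map (the exponential map at the origin) standing in for the hyperbolic network. All function names are hypothetical.

```python
import numpy as np

def pairwise_euclidean(x):
    """Pairwise Euclidean distances between rows of x, shape (m, m)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def pairwise_poincare(z, eps=1e-9):
    """Pairwise geodesic distances in the Poincare ball (curvature -1).

    d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    """
    sq_norm = np.sum(z ** 2, axis=-1)
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    denom = (1.0 - sq_norm)[:, None] * (1.0 - sq_norm)[None, :]
    arg = 1.0 + 2.0 * sq_dist / np.maximum(denom, eps)
    # Clamp to the domain of arccosh to guard against rounding below 1.
    return np.arccosh(np.maximum(arg, 1.0))

def exp_map_origin(v, eps=1e-9):
    """Exponential map at the origin of the Poincare ball (curvature -1).

    Used here only as a toy stand-in for the transport map T.
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return np.tanh(norm) * v / np.maximum(norm, eps)

def empirical_gm_cost(x, z):
    """Mean squared gap between input and embedding pairwise costs.

    Assumed form: (1 / m^2) * sum_ij (c_X(x_i, x_j) - c_H(z_i, z_j))^2.
    """
    c_x = pairwise_euclidean(x)
    c_h = pairwise_poincare(z)
    return float(np.mean((c_x - c_h) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = 0.1 * rng.standard_normal((8, 4))   # toy Euclidean features
    z = exp_map_origin(x)                   # toy embeddings T(x)
    print("empirical GM-style cost:", empirical_gm_cost(x, z))
```

In a training loop, a term of this kind would be added to the task loss with a weighting coefficient, written in an autodiff framework so that gradients flow through the embeddings.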

Paper Structure (34 sections, 1 theorem, 18 equations, 8 figures, 10 tables)

This paper contains 34 sections, 1 theorem, 18 equations, 8 figures, 10 tables.

Key Result

Theorem

Given cost functions $c_{\mathbb{X}}$ and $c_{\mathbb{H}}$, suppose that there exists a constant $0<\alpha\leq1$ for which $T$ satisfies the following bi-Lipschitz condition: [...] for any ${\mathbf{x}}, {\mathbf{x}}' \in {\mathbb{X}}$. Let $\mu$ be a distribution defined on ${\mathbb{X}}$ and let $\{{\mathbf{x}}_i\}_{i=1}^m$ be sampled i.i.d. from $\mu$. Then, [...] holds with probability at least $1 - 2\exp(\ldots)$.

Figures (8)

  • Figure 1: Illustration of hierarchical structures in image data. The images are taken from the MiniImagenet dataset.
  • Figure 2: Illustration of the transport map $T$. The feature distribution $\mu$ defined on ${\mathbb{X}}$ is pushed forward to $\nu = T _\# \mu$ on ${\mathbb{H}}$.
  • Figure 3: The framework of our GW regularization.
  • Figure 4: The framework of the GW regularized hyperbolic ProtoNet model.
  • Figure 5: Sensitivity analysis of HNN.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Theorem
  • Proof