Weighted Embeddings for Low-Dimensional Graph Representation

Thomas Bläsius, Jean-Pierre von der Heydt, Maximilian Katzmann, Nikolai Maas

TL;DR

This work provides the embedding algorithm WEmbed and demonstrates that its weighted embeddings heavily outperform state-of-the-art Euclidean embeddings for heterogeneous graphs while using fewer dimensions.

Abstract

Learning low-dimensional numerical representations from symbolic data, e.g., embedding the nodes of a graph into a geometric space, is an important concept in machine learning. While embedding into Euclidean space is common, recent observations indicate that hyperbolic geometry is better suited to represent hierarchical information and heterogeneous data (e.g., graphs with a scale-free degree distribution). Despite their potential for more accurate representations, hyperbolic embeddings also have downsides, such as being more difficult to compute and harder to use in downstream tasks. We propose embedding into a weighted space, which is closely related to hyperbolic geometry but mathematically simpler. We provide the embedding algorithm WEmbed and demonstrate, based on generated graphs as well as over 2000 real-world graphs, that our weighted embeddings heavily outperform state-of-the-art Euclidean embeddings for heterogeneous graphs while using fewer dimensions. For the remaining instances, the running time and embedding quality of WEmbed are on par with state-of-the-art Euclidean embedders.

Paper Structure (17 sections, 3 equations, 5 figures, 2 tables)


Figures (5)

  • Figure 1: A 2-dimensional weighted embedding of the internet graph used by Boguñá et al. [Sustain_Inter-Bogun10]. The node size indicates the weight.
  • Figure 2: Different loss functions and resulting forces. Green lines correspond to $\mathcal{L}_{\mathrm{rep}}$ and $f_{\mathrm{rep}}$, blue lines to $\mathcal{L}_{\mathrm{attr}}$ and $f_{\mathrm{attr}}$, and orange lines to their sum. The threshold $\ell$ is marked in gray.
  • Figure 3: Embedding quality for the real data set. Each line represents the CDF of the F1-score (note the reversed $x$-axis) for an embedder in a fixed dimension. Dimensions for the different embedders increase in powers of $2$: $4, 8, 16, 32$ for WEmbed and additionally $64, 128$ for the others.
  • Figure 4: Comparison of the quality of $8$-dimensional embeddings on the real data set. Each point is one graph, with the F1-score achieved by WEmbed on the $y$-axis and the F1-score achieved by the Euclidean approaches on the $x$-axis. The color indicates the heterogeneity of the degree distribution. Euclidean (first column) refers to WEmbed with uniform weights.
  • Figure 5: Embedding quality for GIRGs with different power-law exponents. Euclidean refers to WEmbed with uniform weights.
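
The caption of Figure 3 describes each curve as the CDF of the F1-score over the graphs in the data set. As a reading aid, the following minimal sketch shows how such an empirical CDF could be computed from a list of per-graph scores. The function name `empirical_cdf` and the scores in `f1_scores` are hypothetical and chosen purely for illustration; they are not taken from the paper or its data.

```python
from bisect import bisect_right


def empirical_cdf(scores):
    """Return a function F where F(x) is the fraction of scores <= x."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one score")

    def cdf(x):
        # bisect_right gives the number of sorted scores that are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical per-graph F1-scores for one embedder in one fixed dimension
# (illustrative values only, not results from the paper).
f1_scores = [0.95, 0.80, 0.99, 0.60, 0.90]
cdf = empirical_cdf(f1_scores)

# Thresholds listed from high to low, mirroring a reversed x-axis.
thresholds = [1.0, 0.9, 0.8, 0.5]
curve = [(t, cdf(t)) for t in thresholds]
```

Each pair in `curve` holds a threshold and the fraction of graphs whose F1-score is at most that threshold. Plotting one such curve per embedder and dimension would give a figure of the kind the caption describes.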