The Effects of Randomness on the Stability of Node Embeddings

Tobias Schumacher, Hinrikus Wolf, Martin Ritzert, Florian Lemmerich, Jan Bachmann, Florian Frantzen, Max Klabunde, Martin Grohe, Markus Strohmaier

TL;DR

This paper investigates how randomness inherent in state-of-the-art node-embedding algorithms affects stability in both embedding geometry and downstream classification. It evaluates five algorithms (HOPE, LINE, node2vec, SDNE, GraphSAGE) on synthetic and real graphs using three geometric measures (aligned cosine similarity, k-NN Jaccard similarity, second-order cosine similarity) and node-classification performance, revealing substantial geometric instability for most methods except HOPE, while downstream classification accuracy remains largely robust. The study emphasizes that, despite stable overall performance, individual node predictions can differ across embeddings, underscoring reproducibility concerns in embedding-based workflows. These findings motivate the design of stability-aware embeddings and repeated evaluations to ensure reliable deployment, especially in high-stakes or privacy-sensitive applications.
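To make the first geometric measure concrete, here is a minimal sketch of an aligned cosine similarity between two embeddings of the same graph. It assumes the alignment step is an orthogonal Procrustes rotation, which this summary does not specify. The function name and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np

def aligned_cosine_similarity(emb_a, emb_b):
    """Node-wise cosine similarity after rotating emb_a onto emb_b.

    Assumption: alignment is an orthogonal Procrustes rotation, i.e. the
    orthogonal Q minimizing ||emb_a @ Q - emb_b||_F.
    """
    # Closed-form Procrustes solution: with A^T B = U S V^T, Q = U V^T.
    u, _, vt = np.linalg.svd(emb_a.T @ emb_b)
    rotated = emb_a @ (u @ vt)
    # Row-wise cosine similarity between aligned vectors of the same node.
    num = np.sum(rotated * emb_b, axis=1)
    den = np.linalg.norm(rotated, axis=1) * np.linalg.norm(emb_b, axis=1)
    return num / den

# Toy check: an embedding and a rotated copy of it should align perfectly.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))                    # 50 nodes, 8 dimensions
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))      # random orthogonal matrix
sims = aligned_cosine_similarity(emb, emb @ q)
```

Because a pure rotation is undone by the alignment, every entry of `sims` is 1 up to floating-point error. Values below 1 between two real runs would indicate differences that a rotation cannot explain.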

Abstract

We systematically evaluate the (in-)stability of state-of-the-art node embedding algorithms due to randomness, i.e., the random variation of their outcomes given identical algorithms and graphs. We apply five node embedding algorithms (HOPE, LINE, node2vec, SDNE, and GraphSAGE) to synthetic and empirical graphs and assess their stability under randomness with respect to (i) the geometry of embedding spaces as well as (ii) their performance in downstream tasks. We find significant instabilities in the geometry of embedding spaces independent of the centrality of a node. In the evaluation of downstream tasks, we find that the accuracy of node classification seems to be unaffected by random seeding while the actual classification of nodes can vary significantly. This suggests that instability effects need to be taken into account when working with node embeddings. Our work is relevant for researchers and engineers interested in the effectiveness, reliability, and reproducibility of node embedding approaches.

Paper Structure

This paper contains 15 sections, 5 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Illustration. Stability of node embeddings with respect to their (i) geometry and (ii) predictive (classification) power. Left: Zachary's Karate Club graph (Zachary, 1977). Middle: Comparison approach. Right: Geometric and predictive stability, illustrated with a decision tree classifier. Lines are used to visually connect identical nodes between embeddings A and B.
  • Figure 2: Geometric stability. Each letter-value plot shows the node-wise similarity values resulting from 30 runs per algorithm and graph. In (a) we use aligned cosine similarity, in (b) we use the 20-NN Jaccard similarity. In both (a) and (b), HOPE achieves a similarity of 1.0 with hardly any variance, whereas for GraphSAGE similarities are around 0.1 in (a) and 0 with hardly any variance in (b). In between, LINE, node2vec and SDNE appear moderately stable with respect to aligned cosine similarity. In contrast, we see in (b) that high Jaccard similarities over 0.7 rarely occur.
  • Figure 3: Influence of node centrality. The moving average of the node-wise 20-NN Jaccard similarities resulting from 30 embeddings per graph is plotted against each node's PageRank. The (in-)stability of HOPE and GraphSAGE appears to be invariant to node centrality. For the other algorithms there is no clear correlation between the centrality of a node and the stability of its embeddings.
  • Figure 4: Influence of graph properties. In (a) synthetic graphs with varying size at fixed density 0.01 and in (b) synthetic graphs with varying density and 8000 nodes are used to measure the influence of those graph properties on stability. Each data point represents the average node-wise similarity over all nodes per graph and all 435 embedding pairs resulting from 30 runs of the corresponding algorithm. Except for a negative trend for GraphSAGE, the graph size does not seem to influence stability. Increasing density mostly leads to more stable embeddings for SDNE and node2vec, and has little or no effect on other embedding algorithms.
  • Figure 5: Influence of node distance. Mean absolute deviation of the angles between embedding vectors of distinct nodes over 30 embeddings. Node pairs are categorized into (i) neighboring nodes, (ii) 2-hop neighbors, (iii) more distant nodes. 1000 node pairs were sampled for each category. Except for LINE's embeddings, the variability in angles appears invariant to the distance between the corresponding nodes.
  • ...and 2 more figures
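
The 20-NN Jaccard similarity in Figure 2(b) and the 435 embedding pairs in Figure 4 can be illustrated with a short sketch. It assumes that neighbors are ranked by cosine similarity, which the captions do not state. The function names are hypothetical, and random matrices stand in for the 30 embedding runs.

```python
from itertools import combinations

import numpy as np

def knn_sets(emb, k):
    """Return, for each node, the set of its k nearest neighbors.

    Assumption: neighbors are ranked by cosine similarity.
    """
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)           # a node is not its own neighbor
    top = np.argsort(-sim, axis=1)[:, :k]    # indices of the k most similar nodes
    return [set(row) for row in top]

def knn_jaccard(emb_a, emb_b, k=20):
    """Node-wise Jaccard similarity of the k-NN sets in two embeddings."""
    sets_a, sets_b = knn_sets(emb_a, k), knn_sets(emb_b, k)
    return np.array([len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)])

# Stand-ins for 30 runs of one algorithm on one graph (100 nodes, 16 dims).
rng = np.random.default_rng(0)
runs = [rng.normal(size=(100, 16)) for _ in range(30)]

# All unordered pairs of runs: 30 * 29 / 2 = 435.
pairs = list(combinations(range(len(runs)), 2))

# One data point as in Figure 4: average over all nodes and all pairs.
mean_sim = np.mean([knn_jaccard(runs[i], runs[j]).mean() for i, j in pairs])
```

Comparing an embedding with itself gives a Jaccard similarity of 1 for every node, while unrelated random matrices like the ones above give values close to 0. Since the measure depends only on neighbor rankings, no alignment between embeddings is needed before comparing them.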