Zoo Guide to Network Embedding

Anthony Baptista; Rubén J. Sánchez-García; Anaïs Baudot; Ginestra Bianconi

TL;DR

Network embedding maps the nodes of a network to points in a latent space, typically $X=\mathbb{R}^d$, so that structural relations are preserved for visualization and downstream inference. The paper presents a flexible, mathematically grounded taxonomy, anchored by encoder-decoder formulations, that divides methods into shallow (matrix factorisation, random walk, optimisation), deep learning (conventional neural networks, GNNs, graph generative methods), and higher-order (hypergraphs, simplicial complexes) approaches, and shows where emerging methods (embeddings in hyperbolic and Lorentzian spaces, magnetic and connection Laplacians) fit within it. It surveys canonical techniques (e.g., Laplacian Eigenmaps, DeepWalk, LINE, VERSE), links them to applications such as knowledge graphs and biology, and offers guidance on method selection, evaluation, and practical workflows for researchers and practitioners.

Abstract

Networks have provided extremely successful models of data and complex systems. Yet, as combinatorial objects, networks do not, in general, have intrinsic coordinates and do not typically lie in an ambient space. The process of assigning an embedding space to a network has attracted considerable interest in the past few decades, and has been efficiently applied to fundamental problems in network inference, such as link prediction, node classification, and community detection. In this review, we provide a user-friendly guide to the network embedding literature and to current trends in the field, allowing the reader to navigate the complex landscape of methods and approaches emerging from the vibrant research activity on these subjects.

Paper Structure (19 sections, 3 equations, 3 figures, 2 tables)

Definitions and preliminaries
Existing Taxonomies of network embedding methods
Taxonomy of network embedding methods
Shallow network embedding methods
Matrix factorisation methods
Random walk methods
Optimisation methods
Deep learning methods
Conventional neural networks
Graph Neural Networks (GNN)
Graph generative methods
Higher-order network methods
Emerging methods
Network embedding in hyperbolic and Lorentzian spaces
Network embeddings using Magnetic and Connection Laplacians
...and 4 more sections

Figures (3)

Figure 1: Pie charts describing the new taxonomy defined in this manuscript. In the top pie chart, the methods are divided into two main categories, shallow embedding methods and deep learning methods, complemented by higher-order methods, which can be either shallow embedding or deep learning methods. The bottom pie chart highlights the three major emerging groups of methods. Notably, because the taxonomy is flexible, these emerging groups can also be classified within it.
Figure 2: Shallow network embedding. To perform shallow network embedding, a network is projected into a low-dimensional vector space, such as a 2-dimensional embedding space. The projection is given by a mapping function $f$ from the direct space to the embedding space. The mapping function $f$ is derived by optimizing a loss function $\mathcal{L}$, which aims to minimize the difference between the similarity measures of nodes in the direct space ($S_{D}$) and their equivalents in the embedding space ($S_{E}$), obtained through the decoder function.
Figure 3: Workflow for choosing a network embedding method. The application of network embedding starts with some preliminary questions (top box). Depending on the answers, and the task to be performed, different network embedding methods can be applied. Here, we list the most common methods associated with the most common network embedding tasks. Once the embedding representation has been obtained, the workflow could further perform some 'sanity checks' to measure the efficiency of the network embedding setup (bottom box). Depending on the results of these sanity checks, we can either stop the development or go back to the preliminary questions in order to improve the workflow. These improvements are usually based on adding complementary information, tuning the parameters, or redefining the properties to be preserved by the embedding representation.
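
The encoder-decoder setup described in the Figure 2 caption can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the direct-space similarity $S_{D}$ is taken to be the adjacency matrix, the decoder is an inner product so that $S_{E} = ZZ^{\top}$, the loss $\mathcal{L}$ is the squared Frobenius norm of $S_{E} - S_{D}$, and the optimiser is plain gradient descent. The function name `shallow_embed` and the toy graph are hypothetical.

```python
import numpy as np

def shallow_embed(S_D, d=2, lr=0.05, steps=2000, seed=0):
    """Learn an n x d embedding Z so that S_E = Z Z^T approximates S_D.

    Assumptions (illustrative only): S_D is symmetric, the decoder is an
    inner product, and the loss is the squared Frobenius norm.
    """
    rng = np.random.default_rng(seed)
    n = S_D.shape[0]
    Z = 0.1 * rng.standard_normal((n, d))  # encoder: one free vector per node
    losses = []
    for _ in range(steps):
        S_E = Z @ Z.T                      # decoder: similarities in the embedding space
        R = S_E - S_D                      # residual between the two similarity matrices
        losses.append(float(np.sum(R ** 2)))
        grad = 4.0 * R @ Z                 # gradient of ||Z Z^T - S_D||_F^2 for symmetric S_D
        Z -= (lr / n) * grad               # gradient descent step, scaled by graph size
    return Z, losses

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

Z, losses = shallow_embed(A, d=2)
print(Z.round(2))                          # one 2-dimensional coordinate per node
print(losses[0], losses[-1])               # loss before and after optimisation
```

Other choices of $S_{D}$ (for example, random-walk co-occurrence statistics) and of decoder lead to different members of the shallow family; this sketch only shows the common structure of fitting $S_{E}$ to $S_{D}$.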
