Graph Vertex Embeddings: Distance, Regularization and Community Detection
Radosław Nowak, Adam Małkowski, Daniel Cieślak, Piotr Sokół, Paweł Wawrzyński
TL;DR
This work addresses the problem of representing graph-structured data with low-dimensional, distance-preserving embeddings. It combines optimization-based embeddings with neural network regularization by modeling vertex embeddings as a neural transformation of the columns of the distance matrix. This supports a flexible family of distance functions $d(e_i,e_j)=\|e_i-e_j\|^{\kappa}$ with $\kappa\in[0,2]$, where $\kappa$ is learned jointly. The approach minimizes a discrepancy loss between embedding distances and graph distances, evaluates both absolute and relative variants of this loss, and demonstrates improvements, especially at low embedding dimensions. Experiments on diverse benchmarks show competitive distance preservation and modularity-based community detection, with notable strength on the Zachary Karate Club graph. The results suggest a scalable, expressive embedding framework that supports downstream clustering and graph analysis tasks.
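As a rough illustration of the distance family $d(e_i,e_j)=\|e_i-e_j\|^{\kappa}$ and of a discrepancy loss between embedding and graph distances, the following NumPy sketch may help. The function names and the specific loss forms (mean squared error for the absolute loss, mean squared error divided by the graph distance for the relative loss) are assumptions made for this example and are not taken from the paper; $\kappa$ is passed in as a fixed number rather than learned.

```python
import numpy as np

def embedding_distances(E, kappa):
    """Pairwise d(e_i, e_j) = ||e_i - e_j||**kappa for the rows of E."""
    diff = E[:, None, :] - E[None, :, :]      # shape (n, n, dim)
    norms = np.linalg.norm(diff, axis=-1)     # Euclidean norms, shape (n, n)
    return norms ** kappa

def discrepancy_losses(E, D, kappa):
    """Return (absolute, relative) losses over distinct vertex pairs.

    Assumed forms, chosen for illustration only:
      absolute: mean of (d_emb - d_graph)^2
      relative: mean of ((d_emb - d_graph) / d_graph)^2
    D must hold finite, positive distances off the diagonal.
    """
    iu = np.triu_indices(D.shape[0], k=1)     # each unordered pair once
    d_emb = embedding_distances(E, kappa)[iu]
    d_graph = D[iu]
    err = d_emb - d_graph
    return float(np.mean(err ** 2)), float(np.mean((err / d_graph) ** 2))

# Path graph 0 - 1 - 2 with its shortest-path distance matrix.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

# One-dimensional embedding placing the vertices on a line.
E = np.array([[0.0], [1.0], [2.0]])

abs_loss, rel_loss = discrepancy_losses(E, D, kappa=1.0)
print(abs_loss, rel_loss)
```

With $\kappa=1$ this embedding reproduces the path distances exactly, so both losses vanish. Raising $\kappa$ to 2 stretches the distance between the end vertices and both losses become positive, which hints at why treating $\kappa$ as a tunable quantity can matter.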
Abstract
Graph embeddings have emerged as a powerful tool for representing complex network structures in a low-dimensional space, enabling the use of efficient methods that employ the metric structure of the embedding space as a proxy for the topological structure of the data. In this paper, we explore several aspects that affect the quality of a vertex embedding of graph-structured data. To this end, we first present a family of flexible distance functions that faithfully capture the topological distance between vertices. Second, we analyze vertex embeddings as resulting from a fitted transformation of the distance matrix rather than as a direct result of optimization. Finally, we evaluate the effectiveness of our proposed embedding constructions by performing community detection on a host of benchmark datasets. The reported results are competitive with classical algorithms that operate on the entire graph, while benefiting from a substantially reduced computational complexity due to the reduced dimensionality of the representations.
