How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings
Nikolaos Nakis, Niels Raunkjær Holm, Andreas Lyhne Fiehn, Morten Mørup
TL;DR
The paper investigates how few dimensions suffice to exactly reconstruct complex networks, showing that Euclidean metric embeddings based on the latent distance model can match or beat LPCA in embedding efficiency. It introduces a binary-search procedure to bound the exact embedding dimension D^*, and a KD-tree–based, linearithmic reconstruction check to scale to large graphs, complemented by a hierarchical block distance model (HBDM) for scalable initialization. A theoretical result, D^*_{LPCA}-2 ≤ D^*_{L2} ≤ D^*_{LPCA}, and extensive experiments on graphs ranging from small networks to million-node graphs demonstrate substantially lower embedding dimensions than prior bounds, including successful exact reconstructions for large networks. These findings enable highly compact, lossless graph representations with broad implications for visualization, community detection, node classification, and link prediction, and the methodology is scalable and reproducible.
Abstract
Low-dimensional embeddings are essential for machine learning tasks involving graphs, such as node classification, link prediction, community detection, network visualization, and network compression. Although recent studies have identified exact low-dimensional embeddings, the limits of the required embedding dimensions remain unclear. Here, we prove that lower-dimensional embeddings are possible when using Euclidean metric embeddings as opposed to vector-based Logistic PCA (LPCA) embeddings. In particular, we provide an efficient logarithmic search procedure for identifying the exact embedding dimension and demonstrate how metric embeddings enable inference of the exact embedding dimensions of large-scale networks by exploiting metric properties to achieve linearithmic scaling. Empirically, we show that our approach extracts substantially lower-dimensional representations of networks than previously reported for small-sized networks. For the first time, we demonstrate that even large-scale networks can be effectively embedded in very low-dimensional spaces, and provide examples of scalable, exact reconstruction for graphs with up to a million nodes. Our approach highlights that the intrinsic dimensionality of networks is substantially lower than previously reported and provides a computationally efficient assessment of the exact embedding dimension, even for large-scale networks. The surprisingly low-dimensional representations achieved demonstrate that networks in general can be losslessly represented using very low-dimensional feature spaces, which can be used to guide existing network analysis tasks, from community detection and node classification to structure-revealing exact network visualizations.
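
The logarithmic search and the metric-based reconstruction check described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simplified model in which two nodes are linked exactly when their Euclidean distance is at most a single global `radius`, which may differ from the paper's latent distance model. The names `reconstructs_exactly`, `smallest_exact_dimension`, and the user-supplied `fit_embedding` callback are hypothetical. The only library call relied on is SciPy's `cKDTree.query_pairs`.

```python
import numpy as np
from scipy.spatial import cKDTree


def reconstructs_exactly(Z, edges, radius):
    """Return True if thresholding distances at `radius` recovers `edges`.

    Assumption (not from the paper): one global threshold for all nodes.
    Z is an (n_nodes, dim) array; edges is an iterable of (i, j) pairs.
    """
    tree = cKDTree(np.asarray(Z, dtype=float))
    # All unordered pairs (i < j) whose Euclidean distance is <= radius,
    # found through the tree instead of scanning every pair of nodes.
    predicted = tree.query_pairs(r=radius)
    target = {(min(i, j), max(i, j)) for i, j in edges}
    return predicted == target


def smallest_exact_dimension(fit_embedding, edges, d_max):
    """Binary search for the smallest dimension with exact reconstruction.

    `fit_embedding(d)` is a hypothetical callback that trains a
    d-dimensional embedding and returns (Z, radius). The search assumes
    that success at dimension d implies success at every larger d.
    Returns None if no dimension in [1, d_max] succeeds.
    """
    lo, hi, best = 1, d_max, None
    while lo <= hi:
        mid = (lo + hi) // 2
        Z, radius = fit_embedding(mid)
        if reconstructs_exactly(Z, edges, radius):
            best = mid        # success: try fewer dimensions
            hi = mid - 1
        else:
            lo = mid + 1      # failure: more dimensions are needed
    return best
```

Because a failed fit may reflect an optimization failure rather than a true impossibility, the dimension returned by such a search is best read as an upper bound on D^*, not as a certified minimum.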
