Information-Theoretic Limits of Node Localization under Hybrid Graph Positional Encodings

Zimo Yan, Zheng Xie, Chang Liu, Yiqin Lv, Runfan Duan

Abstract

Positional encoding has become a standard component in graph learning, especially for graph Transformers and other models that must distinguish structurally similar nodes, yet its fundamental identifiability remains poorly understood. In this work, we study node localization under a hybrid positional encoding that combines anchor-distance profiles with quantized low-frequency spectral features. We cast localization as an observation-map problem whose difficulty is controlled by the number of distinct codes induced by the encoding and establish an information-theoretic converse identifying an impossibility regime jointly governed by the anchor number, spectral dimension, and quantization level. Experiments further support this picture: on random $3$-regular graphs, the empirical crossover is well organized by the predicted scaling, while on two real-world DDI graphs identifiability is strongly graph-dependent, with DrugBank remaining highly redundant under the tested encodings and the Decagon-derived graph becoming nearly injective under sufficiently rich spectral information. Overall, these results suggest that positional encoding should be understood not merely as a heuristic architectural component, but as a graph-dependent structural resolution mechanism.
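To make the hybrid encoding concrete, here is a minimal sketch of the two ingredients named in the abstract: an anchor-distance profile (hop distances to a small anchor set) concatenated with quantized entries of low-frequency Laplacian eigenvectors. The function name `hybrid_codes`, the toy 6-cycle graph, and the specific quantization convention (`floor(x / eta)`) are illustrative assumptions, not the paper's exact construction; the point is only that the number of distinct codes the map induces is what controls identifiability.

```python
import numpy as np

def hybrid_codes(adj, anchors, m, eta):
    # Hypothetical hybrid positional encoding (illustrative, not the
    # paper's exact definition): anchor-distance profile concatenated
    # with the first m nontrivial Laplacian eigenvector entries,
    # quantized with step eta.
    n = len(adj)

    def bfs(src):
        # Hop distances from src over the adjacency matrix.
        dist = [-1] * n
        dist[src] = 0
        frontier = [src]
        while frontier:
            nxt = []
            for u in frontier:
                for v in range(n):
                    if adj[u][v] and dist[v] < 0:
                        dist[v] = dist[u] + 1
                        nxt.append(v)
            frontier = nxt
        return dist

    dists = [bfs(a) for a in anchors]

    # Low-frequency spectral features: skip the constant eigenvector
    # (eigenvalue 0) and take the next m columns of the Laplacian's
    # eigenbasis (numpy.linalg.eigh returns ascending eigenvalues).
    A = np.array(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)
    spec = vecs[:, 1:1 + m]

    codes = []
    for v in range(n):
        anchor_part = tuple(d[v] for d in dists)
        spectral_part = tuple(int(np.floor(x / eta)) for x in spec[v])
        codes.append(anchor_part + spectral_part)
    return codes

# 6-cycle: one anchor's distances cannot separate nodes mirrored
# across it, so the anchor-only code count is strictly below n.
adj = [[1 if abs(i - j) in (1, 5) else 0 for j in range(6)]
       for i in range(6)]
codes = hybrid_codes(adj, anchors=[0], m=1, eta=0.05)
print(len(set(codes)))
```

On this toy graph the anchor profile alone yields only 4 distinct codes among 6 nodes; adding a quantized spectral coordinate can split the mirrored pairs, moving the map toward injectivity.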

Paper Structure

This paper contains 49 sections, 17 theorems, 225 equations, 4 figures, 9 tables.

Key Result

Lemma 1

For any fixed $(G,\mathcal{A})$, let $\Phi$ denote the induced observation map on the node set $V$, $|V|=n$, and let $N(G,\mathcal{A}) = |\Phi(V)|$ be the number of distinct codes. Then for any measurable map $\psi$ from codes to nodes one has
$$\Pr_{v \sim \mathrm{Unif}(V)}\big[\psi(\Phi(v)) = v\big] \;\le\; \frac{N(G,\mathcal{A})}{n}.$$
Moreover, the upper bound is attained by any measurable section $\psi$ satisfying $\psi(\Phi(v)) \in \Phi^{-1}(\Phi(v))$ for all $v \in V$. Consequently,
$$\inf_{\psi} \Pr_{v \sim \mathrm{Unif}(V)}\big[\psi(\Phi(v)) \neq v\big] \;=\; 1 - \frac{N(G,\mathcal{A})}{n}.$$
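The counting argument behind this lemma can be checked numerically. The sketch below uses a toy observation map `phi` on $n = 6$ nodes with $N = 4$ distinct codes (both names are illustrative, not from the paper): any decoder is correct on at most one node per preimage, so the best uniform-prior success probability is $N/n$, and a section that picks one representative per fiber attains it.

```python
from fractions import Fraction

# Toy observation map on n = 6 nodes producing N = 4 distinct codes
# (illustrative choice; any non-injective map works the same way).
phi = {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'c', 5: 'b'}
n = len(phi)
N = len(set(phi.values()))

def success(psi):
    # Uniform-prior probability that decoder psi recovers the node.
    return Fraction(sum(psi[phi[v]] == v for v in phi), n)

# A measurable section: one representative node from each preimage.
section = {}
for v, c in phi.items():
    section.setdefault(c, v)

print(success(section))  # attains N/n
```

Since the decoder can be right on exactly one node in each of the $N$ fibers, no choice of `psi` can exceed $N/n$, matching the lemma's converse.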

Figures (4)

  • Figure 1: Phase-transition-like behavior on random $3$-regular graphs. (a) Mean localization error versus the number of anchors $k$. (b) Mean average preimage size versus the budget ratio $\rho_{\mathrm{eng}}$, showing the collapse of observation-map fibers in the low-error regime. (c) Error curves plotted against $\rho_{\mathrm{eng}}$ for spectral dimensions $m\in\{0,1,2,5\}$. Across spectral dimensions, the transition is more consistently organized by the theory-guided budget ratio than by the anchor number alone.
  • Figure 2: Empirical threshold $k_{\mathrm{emp}}$ versus quantization step $\eta$ for random $3$-regular graphs. Each panel fixes $n$, and each curve corresponds to a spectral dimension $m$.
  • Figure 3: Hyperparameter interaction on random $3$-regular graphs at $n=2000$. Top: mean error versus $\eta$ for fixed $k=4$. Bottom: mean error versus $k$ for fixed $m=5$.
  • Figure 4: Bucketwise diagnostics on random $3$-regular graphs. Left: mean error vs. singleton-node fraction. Right: mean error vs. mean within-bucket collision. Highlighted points show an intermediate regime and a near-injective regime.

Theorems & Definitions (25)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Proposition 4
  • Theorem 5
  • Corollary 1
  • Proposition 6
  • Corollary 2
  • Corollary 3
  • Lemma: preimage identity
  • ...and 15 more