Range-aware Positional Encoding via High-order Pretraining: Theory and Practice

Viet Anh Nguyen; Nhat Khang Ngo; Truong Son Hy

TL;DR

This work tackles data scarcity and domain dependence in graph pretraining by introducing HOPE-WavePE, a range-aware, domain-agnostic pretraining framework that learns to reconstruct high-order adjacencies from multi-resolution spectral graph wavelet signals. It combines a fast, scalable graph wavelet transform with a high-order permutation-equivariant autoencoder to produce node-level structural embeddings that generalize across tasks and domains. The approach is supported by a theoretical result guaranteeing the ability to approximate a weighted sum of multi-hop adjacencies and is validated through extensive experiments on MoleculeNet, LRGB, TU datasets, MNIST/CIFAR-10, and ZINC, showing improved performance and transferability. Overall, HOPE-WavePE provides a principled, scalable pathway toward general graph structure encoders and graph foundation models, with practical impact for molecule property prediction, materials science, and beyond.

Abstract

Unsupervised pre-training on vast amounts of graph data is critical in real-world applications where labeled data is limited, such as molecular property prediction or materials science. Existing approaches pre-train models for specific graph domains, neglecting the inherent connections within networks, which limits their ability to transfer knowledge to various supervised tasks. In this work, we propose a novel pre-training strategy on graphs that focuses on modeling their multi-resolution structural information, allowing us to capture global information of the whole graph while preserving local structures around its nodes. We extend the Wavelet Positional Encoding (WavePE) of Ngo et al. (2023) by pretraining a High-Order Permutation-Equivariant Autoencoder (HOPE-WavePE) to reconstruct node connectivities from their multi-resolution wavelet signals. Unlike existing positional encodings, our method is designed to be sensitive to the input graph size in downstream tasks, which lets it efficiently capture global structure on graphs. Since our approach relies solely on the graph structure, it is also domain-agnostic and adaptable to datasets from various domains, therefore paving the way for developing general graph structure encoders and graph foundation models. We theoretically demonstrate that there exists a parametrization of this architecture that can predict the output adjacency up to arbitrarily low error. We also evaluate HOPE-WavePE on graph-level prediction tasks from different areas and show its superiority over other methods.
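
To make "multi-resolution wavelet signals" concrete, the following is a minimal sketch of how an $n\times n\times d$ wavelet tensor could be built from a graph's adjacency matrix. It rests on assumptions not stated in the abstract: a heat-kernel filter $e^{-s\lambda}$, the symmetric normalized Laplacian, and an exact eigendecomposition (so this is not the paper's fast transform). The function name `wavelet_tensor` and the chosen scales are hypothetical.

```python
import numpy as np

def wavelet_tensor(adj, scales):
    """Stack heat-kernel graph wavelets exp(-s L), one per scale s, into an (n, n, d) tensor.

    Assumption: heat-kernel filter on the symmetric normalized Laplacian.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # D^{-1/2}, with isolated nodes mapped to 0 to avoid division by zero
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    # L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)
    # Each channel is U diag(exp(-s * lambda)) U^T
    channels = [(evecs * np.exp(-s * evals)) @ evecs.T for s in scales]
    return np.stack(channels, axis=-1)

# Path graph on 4 nodes: 0 - 1 - 2 - 3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
W = wavelet_tensor(adj, scales=[0.5, 2.0, 8.0])
print(W.shape)  # (4, 4, 3)
```

Small scales keep each node's signal concentrated near the node itself, while larger scales spread it across the graph. Relabeling the nodes permutes the rows and columns of every channel in the same way, which is the property a permutation-equivariant autoencoder can exploit.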

Paper Structure (19 sections, 1 theorem, 8 equations, 4 figures, 5 tables)

Methodology
Spectral Graph Wavelet Tensors
Fast Graph Wavelet Transform
Constructing Long-range Pretraining on Domain-Agnostic Data
Masking grants generalizability
Experiments
Setup
Pretraining
Downstream task
Results
MoleculeNet and Long Range Graph Benchmark
TUDataset Benchmark
Different Graph Connectivity Patterns
ZINC Dataset
Ablation Study
...and 4 more sections

Key Result

Theorem 1

For any $\epsilon>0$ and real coefficients $\theta_1,\theta_2,\dots,\theta_d$, there exists a HOPE-WavePE $\varphi:\mathbb{R}^{n\times n\times d}\to\mathbb{R}^{n\times n}$ such that

Figures (4)

Figure 1: Visualization of graph wavelets on the geometric graph of a torus, with scaling factors 4, 15, and 50 respectively. A low scaling factor $s$ results in a highly localized structure around the center node (yellow), while higher factors lead to smoother signals that spread out over a larger part of the graph.
Figure 2: Our proposed equivariant autoencoder pretraining scheme is applied to a large multi-domain dataset while extending the feature degree, overcoming domain-specific weaknesses while also embedding long-range information.
Figure 3: Average reconstruction accuracy on Peptides-struct graphs with different wavelet channel quantities.
Figure 4: Masked vs unmasked reconstruction accuracy on the MNIST dataset. The regions shaded red and blue contain hop lengths $s$ for which $\mathbf{A}_s$ is and is not all ones, respectively.
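
The Figure 4 caption refers to hop lengths $s$ at which $\mathbf{A}_s$ is all ones. As a rough illustration, the sketch below assumes $\mathbf{A}_s$ marks node pairs reachable within at most $s$ hops, with every node counted as reaching itself. That definition is an assumption, since the caption does not spell it out, and the helper names `hop_adjacency` and `saturation_hop` are hypothetical.

```python
import numpy as np

def hop_adjacency(adj, s):
    """Binary matrix whose (i, j) entry is 1 when j is reachable from i in at most s hops.

    Assumption: 'within s hops' semantics, with every node reaching itself.
    """
    adj = np.asarray(adj)
    n = len(adj)
    # One step either follows an edge or stays in place
    step = ((adj != 0) | np.eye(n, dtype=bool)).astype(np.int64)
    reach = np.eye(n, dtype=np.int64)
    for _ in range(s):
        reach = ((reach @ step) > 0).astype(np.int64)
    return reach

def saturation_hop(adj):
    """Smallest s at which hop_adjacency(adj, s) is all ones, or None if the graph is disconnected."""
    n = len(adj)
    for s in range(n):
        if hop_adjacency(adj, s).all():
            return s
    return None

# Path graph on 4 nodes: 0 - 1 - 2 - 3
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
print(saturation_hop(path))  # 3
```

Under this reading, every hop length at or beyond the graph diameter gives an all-ones target, so reconstructing it carries no structural information.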

Theorems & Definitions (1)

Theorem 1