HoGA: Higher-Order Graph Attention via Diversity-Aware k-Hop Sampling

Thomas Bailie, Yun Sing Koh, Karthik Mukkavilli

TL;DR

Edge-based MPNNs face limited expressivity for higher-order topology. HoGA introduces a diversity-aware $k$-hop sampling module that builds $K$-hop attention by walking on the $k$-order line graph $L_k(G)$ and aggregating via $\boldsymbol{A}_{1:K}(\boldsymbol{x}(t)) = \textstyle\sum_{1\le k\le K}\beta(k)\boldsymbol{A}_k(\boldsymbol{x}(t),\boldsymbol{S}_k)$, with a history-buffer-guided heuristic to maximize feature diversity. The sampling strategy reduces redundancy and oversquashing while remaining tractable under a budget $\big|E\big|$, and HoGA can be plugged into existing single-hop models such as GAT and GRAND to yield HoGA-GAT and HoGA-GRAND. Empirically, HoGA improves node classification accuracy across both homophilic and heterophilic benchmarks, often outperforming recent higher-order methods, while maintaining reasonable runtime and memory requirements. This approach provides a scalable pathway to leverage richer topological signals in graphs without prohibitive state-space growth.
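
The weighted sum over hops in the aggregation formula can be illustrated with a small sketch. It assumes the per-hop attention matrices $\boldsymbol{A}_k$ are already computed, and it uses a hypothetical exponentially decaying $\beta(k)$; the summary only states that contributions are weighted by proximity, so the exact form of $\beta$ here is an assumption, as are the function names.

```python
import math

def hop_weights(K, scale=1.0):
    """Hypothetical beta(k): decays exponentially in k and sums to 1 (an assumption)."""
    raw = [math.exp(-scale * k) for k in range(1, K + 1)]
    total = sum(raw)
    return [r / total for r in raw]

def aggregate_attention(per_hop, beta):
    """Compute A_{1:K} = sum_k beta(k) * A_k for same-shaped square matrices."""
    n = len(per_hop[0])
    out = [[0.0] * n for _ in range(n)]
    for b, A_k in zip(beta, per_hop):
        for i in range(n):
            for j in range(n):
                out[i][j] += b * A_k[i][j]
    return out

# Toy 3-node path graph 0-1-2: A1 holds 1-hop weights, A2 holds 2-hop weights.
A1 = [[0.0, 1.0, 0.0],
      [0.5, 0.0, 0.5],
      [0.0, 1.0, 0.0]]
A2 = [[0.0, 0.0, 1.0],
      [0.0, 0.0, 0.0],
      [1.0, 0.0, 0.0]]

beta = hop_weights(K=2)
A = aggregate_attention([A1, A2], beta)
```

In this toy case the 1-hop entries of `A` are scaled by `beta[0]` and the 2-hop entries by `beta[1]`, so nearer neighbors contribute more.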

Abstract

Graphs model latent variable relationships in many real-world systems, and Message Passing Neural Networks (MPNNs) are widely used to learn such structures for downstream tasks. While edge-based MPNNs effectively capture local interactions, their expressive power is theoretically bounded, limiting the discovery of higher-order relationships. We introduce the Higher-Order Graph Attention (HoGA) module, which constructs a k-order attention matrix by sampling subgraphs to maximize diversity among feature vectors. Unlike existing higher-order attention methods that greedily resample similar k-order relationships, HoGA targets diverse modalities in higher-order topology, reducing redundancy and expanding the range of captured substructures. Applied to two single-hop attention models, HoGA achieves at least a 5% accuracy gain on all benchmark node classification datasets and outperforms recent baselines on six of eight datasets. Code is available at https://github.com/TB862/Higher_Order.

Paper Structure

This paper contains 15 sections, 2 theorems, 17 equations, 7 figures, 3 tables.

Key Result

Theorem 1

Let $C = (j_1, \dots, j_L)$ be any cycle of length $L$. Theorem 1 gives the probability that a walk traverses the cycle exactly in order. The edge weights $\omega_{i,q,\tau}$ are updated at each walk iteration by the non-greedy component of the walk; their definition involves $\delta_q(H_\tau)$, the history buffer term at time $\tau$, and a dissimilarity function $f$.
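
The contrast between a greedy walk and a history-buffer-guided walk on a cycle can be illustrated with a toy sketch. It does not implement the weights $\omega_{i,q,\tau}$ of Theorem 1. Instead it assumes scalar node features, an absolute-difference dissimilarity for $f$, and a hypothetical novelty score (the distance to the nearest feature stored in the buffer). All function names are illustrative.

```python
from collections import deque

def dissimilarity(a, b):
    """Stand-in for f: absolute difference of scalar features (an assumption)."""
    return abs(a - b)

def greedy_walk(adj, feats, start, steps):
    """Always move to the neighbor most dissimilar to the current node."""
    path = [start]
    for _ in range(steps):
        cur = path[-1]
        nxt = max(adj[cur], key=lambda q: dissimilarity(feats[q], feats[cur]))
        path.append(nxt)
    return path

def history_walk(adj, feats, start, steps, buffer_size=4):
    """Move to the neighbor whose feature is farthest from everything in the buffer."""
    buffer = deque([feats[start]], maxlen=buffer_size)
    path = [start]
    for _ in range(steps):
        cur = path[-1]

        def novelty(q):
            # Hypothetical score: distance to the closest feature already seen.
            return min(dissimilarity(feats[q], h) for h in buffer)

        nxt = max(adj[cur], key=novelty)
        buffer.append(feats[nxt])
        path.append(nxt)
    return path

# Path graph 0-1-2-3; nodes 0 and 1 have the most contrasting features.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = {0: 0.0, 1: 1.0, 2: 0.5, 3: 0.4}

greedy_path = greedy_walk(adj, feats, start=0, steps=3)
history_path = history_walk(adj, feats, start=0, steps=3)
```

On this toy graph the greedy walk bounces between nodes 0 and 1, while the buffered walk moves on to nodes 2 and 3 because features it has already seen score zero novelty.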

Figures (7)

  • Figure 1: A $40$-node undirected random graph generated with the Erdős–Rényi model. The edge probability is $8\%$ and the random seed is $41$. Coloring represents the magnitude of feature-vector values, between $0$ and $1$. Bold red edges show $15$ steps of a greedy walk and of HoGA's heuristically guided walk, both of which aim to maximize diversity. The greedy walk is prone to getting stuck in cycles, while HoGA can escape them thanks to its diversity heuristic.
  • Figure 2: The Higher-Order Graph Attention (HoGA) module. (a) An input graph of arbitrary topology. (b) HoGA samples the $k$-hop neighborhood up to a maximum value of $K$ via a heuristic walk. (c) The sampling results create an adjacency matrix describing connections via a shortest path of length $k$. (d) Higher-order aggregation combines node information from variable distances, thus recreating the initial graph with self-attention edge weights.
  • Figure 3: Our higher-order attention module aggregates weights from a single-hop self-attention method by weighting contributions proportional to proximity.
  • Figure 4: The history buffer stores concepts previously seen in the $k$-hop neighborhood to avoid repetitively resampling, allowing for greater capture of diverse higher-order relationships.
  • Figure 5: Sensitivity tests with standard deviations across 20 iterations. (a)-(b) Varying the maximal hop number for the HoGA-GAT and HoGA-GRAND models, respectively; (c) scaling factor multiplying $\beta(k)$; (d) relative accuracy under a variable number of layers.
  • ...and 2 more figures
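
The setting of Figure 1 and the shortest-path adjacency of Figure 2(c) can be sketched in plain Python. This is an illustration under stated assumptions: the generator below will not reproduce the exact graph in the figure even with seed $41$, and it enumerates every $k$-hop pair exhaustively, whereas HoGA samples them with a heuristic walk. Function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

def erdos_renyi(n, p, seed):
    """G(n, p) random graph: each undirected edge exists independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def k_hop_adjacency(adj, K):
    """mats[k][i][j] = 1 iff the shortest path between i and j has length exactly k."""
    n = len(adj)
    mats = {k: [[0] * n for _ in range(n)] for k in range(1, K + 1)}
    for i in range(n):
        for j, d in bfs_distances(adj, i).items():
            if 1 <= d <= K:
                mats[d][i][j] = 1
    return mats

# Parameters taken from the Figure 1 caption: 40 nodes, 8% edge probability, seed 41.
graph = erdos_renyi(n=40, p=0.08, seed=41)
hops = k_hop_adjacency(graph, K=3)
```

Each node pair appears in at most one of the matrices in `hops`, and `hops[1]` coincides with the ordinary edge adjacency of `graph`.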

Theorems & Definitions (3)

  • Definition 1: $k$-Order Line Graph
  • Theorem 1: Sampling Repetition on Cycles
  • Corollary 1: History buffer on Cycles