HyperQuery: Beyond Binary Link Prediction

Sepideh Maleki, Josh Vekhter, Keshav Pingali

TL;DR

HyperQuery predicts higher-order relations in both simple and knowledge hypergraphs. It is a self-supervised framework that uses clustering-driven global features to bootstrap node and hyperedge representations. Its core is an Edge2Edge convolution that alternates edge-to-node and node-to-edge message passing, combined with bilinear aggregation and clustering-based initialization, so that both hyperedge prediction and knowledge hypergraph completion can be performed without requiring extensive labels. On knowledge hypergraph benchmarks (e.g., FB-AUTO, M-FB15K, JF17K) and hyperedge prediction datasets (e.g., iAF1260b, iJO1366, USPTO, DBLP), HyperQuery reports state-of-the-art results, and ablations support the choice of the summary function $\Omega$ and the use of bilinear pooling. The method offers a scalable, explainable approach to reasoning about n-ary relations, and by fusing local message passing with global structural signals it sets the stage for solving more complex hyperqueries.

Abstract

Groups with complex set intersection relations are a natural way to model a wide array of data, from the formation of social groups to the complex protein interactions that form the basis of biological life. One approach to representing such higher-order relationships is as a hypergraph. However, efforts to apply machine learning techniques to hypergraph-structured datasets have been limited thus far. In this paper, we address the problem of link prediction in knowledge hypergraphs as well as simple hypergraphs, and we develop a novel, simple, and effective optimization architecture that addresses both tasks. Additionally, we introduce a novel feature extraction technique using node-level clustering, and we show how integrating data from node-level labels can improve system performance. Our self-supervised approach achieves significant improvement over state-of-the-art baselines on several hyperedge prediction and knowledge hypergraph completion benchmarks.

Paper Structure (31 sections, 7 equations, 5 figures, 6 tables)

This paper contains 31 sections, 7 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: HyperQuery processing pipeline: Consider the hypergraph-structured data illustrated on the left. Our system predicts the label of the red hyperedge by performing regression (right) on an embedding (center, grey line) produced by the hyperedge convolution operation that we developed.
  • Figure 2: Generating useful labels using clustering: Not all hypergraph data comes with labels (left). We leverage hypergraph clustering to first assign cluster IDs as labels to the nodes, obtaining an initial cluster assignment $h^0_v$ (center), which we then propagate to the hyperedges via max pooling (right) to obtain the initial edge cluster assignment $h^0_e$.
  • Figure 3: An Edge2Edge convolution operator: Here we illustrate the function $[h_v^k,h_e^k] = E2E(S, h_e^{k-1})$. Our query set $S$ is annotated in blue (left). For each $v_i \in S$, we perform an E2N aggregation and, in the process, compute $h_v^k$ (center left). We then summarize the distribution of node features using some choice of function $\Omega$ to obtain a statistical feature we call $hs_e^k$ (center right). Passing these features through a trainable weight matrix completes the operator (right).
  • Figure 4: Performance of HyperQuery for different numbers of clusters.
  • Figure 5: Learning $W^1$. Given a hypergraph with labels on nodes and edges, we experiment with a zoo of choices for $\Omega$. In this illustration, we convolve once, multiply by $W^1$, apply $\sigma$, and then pass the result through a fully connected layer to return to a space whose dimension matches the number of hyperedge types. Finally, we train the model as an autoencoder.
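
The initialization in Figure 2 and the operator in Figure 3 can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The toy hypergraph, the one-hot encoding of cluster IDs, the use of a mean for the E2N aggregation, ReLU as the nonlinearity, and all function names (`init_features`, `e2e`, `max_pool`, `mean_pool`) are assumptions made for illustration. The bilinear aggregation mentioned in the TL;DR is not modeled here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def one_hot(index: int, size: int) -> Vector:
    """Encode a cluster ID as a one-hot vector (an assumed encoding)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def max_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise maximum over a non-empty collection of vectors."""
    return [max(column) for column in zip(*vectors)]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean over a non-empty collection of vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]


def init_features(
    hyperedges: Dict[str, List[str]],
    cluster_of: Dict[str, int],
    num_clusters: int,
) -> Tuple[Dict[str, Vector], Dict[str, Vector]]:
    """Figure 2: cluster IDs give h_v^0; max pooling over members gives h_e^0."""
    h_v0 = {v: one_hot(c, num_clusters) for v, c in cluster_of.items()}
    h_e0 = {e: max_pool([h_v0[v] for v in members])
            for e, members in hyperedges.items()}
    return h_v0, h_e0


def e2e(
    query_set: Sequence[str],
    hyperedges: Dict[str, List[str]],
    h_e_prev: Dict[str, Vector],
    weight: List[Vector],
    omega: Callable[[Sequence[Vector]], Vector] = mean_pool,
) -> Tuple[Dict[str, Vector], Vector]:
    """Figure 3: one Edge2Edge step for a query set S."""
    dim = len(next(iter(h_e_prev.values())))
    h_v: Dict[str, Vector] = {}
    for v in query_set:
        # E2N: aggregate the features of every hyperedge containing v
        # (mean is an assumption; isolated nodes fall back to zeros).
        incident = [h_e_prev[e] for e, members in hyperedges.items()
                    if v in members]
        h_v[v] = mean_pool(incident) if incident else [0.0] * dim
    # N2E: summarise the node features of S with Omega to obtain hs_e^k.
    hs_e = omega([h_v[v] for v in query_set])
    # Apply the weight matrix, then ReLU (assumed nonlinearity).
    h_e = [max(0.0, sum(w * x for w, x in zip(row, hs_e))) for row in weight]
    return h_v, h_e


# Toy data: three hyperedges over five nodes, three clusters.
hyperedges = {"e1": ["a", "b", "c"], "e2": ["b", "c", "d"], "e3": ["d", "e"]}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

h_v0, h_e0 = init_features(hyperedges, cluster_of, num_clusters=3)
h_v1, h_e1 = e2e(["a", "d", "e"], hyperedges, h_e0, identity)
```

Here the query set `["a", "d", "e"]` plays the role of a candidate hyperedge $S$. Passing `omega=max_pool` instead of the default swaps in a different summary function, which is the kind of choice the $\Omega$ experiments in Figure 5 vary. In a trained model the identity matrix would be replaced by learned weights.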