From Primes to Paths: Enabling Fast Multi-Relational Graph Analysis

Konstantinos Bougiatiotis, Georgios Paliouras

TL;DR

This work enhances the Prime Adjacency Matrices (PAMs) framework with a lossless algorithm for calculating multi-hop matrices, and proposes the Bag of Paths (BoP) representation, a versatile feature extraction methodology for graph analytics tasks at the node, edge, and graph level.

Abstract

Multi-relational networks capture intricate relationships in data and have diverse applications across fields such as biomedical, financial, and social sciences. As networks derived from increasingly large datasets become more common, identifying efficient methods for representing and analyzing them becomes crucial. This work extends the Prime Adjacency Matrices (PAMs) framework, which employs prime numbers to represent distinct relations within a network uniquely. This enables a compact representation of a complete multi-relational graph using a single adjacency matrix, which, in turn, facilitates quick computation of multi-hop adjacency matrices. In this work, we enhance the framework by introducing a lossless algorithm for calculating the multi-hop matrices and propose the Bag of Paths (BoP) representation, a versatile feature extraction methodology for various graph analytics tasks, at the node, edge, and graph level. We demonstrate the efficiency of the framework across various tasks and datasets, showing that simple BoP-based models perform comparably to or better than commonly used neural models while offering improved speed and interpretability.
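
To make the core idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy graph with at most one relation per node pair, maps each relation to a distinct prime, stores the whole graph in one integer matrix `P`, and obtains 2-hop values with an ordinary matrix product. The names `triples`, `rel2prime`, and `node_idx` are illustrative only.

```python
import numpy as np

# Hypothetical toy graph given as (head, relation, tail) triples.
triples = [
    ("A", "r1", "B"),
    ("B", "r2", "C"),
    ("A", "r3", "C"),
    ("C", "r1", "D"),
]

nodes = sorted({n for h, _, t in triples for n in (h, t)})
relations = sorted({r for _, r, _ in triples})
node_idx = {n: i for i, n in enumerate(nodes)}

# Assumption: each relation type gets its own prime, in sorted order.
primes = [2, 3, 5, 7, 11, 13]
rel2prime = {r: primes[i] for i, r in enumerate(relations)}

# Single adjacency matrix for the whole multi-relational graph:
# P[i, j] holds the prime of the relation on edge i -> j, or 0 if no edge.
P = np.zeros((len(nodes), len(nodes)), dtype=np.int64)
for h, r, t in triples:
    P[node_idx[h], node_idx[t]] = rel2prime[r]

# 2-hop matrix: each non-zero entry is the sum, over 2-hop paths between
# the pair, of the product of the primes along the path.
P2 = P @ P

a, c, d = node_idx["A"], node_idx["C"], node_idx["D"]
print(P2[a, c])  # path A -r1-> B -r2-> C gives 2 * 3
print(P2[a, d])  # path A -r3-> C -r1-> D gives 5 * 2
```

In this toy graph every node pair is connected by at most one 2-hop path, so each entry of `P2` is a single product. When several paths connect the same pair, a plain matrix product sums their values and the individual paths can no longer be read off directly; the sketch does not attempt to reproduce the lossless algorithm that the paper introduces for that case.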

Paper Structure

This paper contains 23 sections, 1 theorem, 31 equations, 6 figures, 10 tables, 2 algorithms.

Key Result

Theorem B.1

Every positive integer $n > 1$ can be represented in exactly one way as a product of prime powers:

$$n = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k} = \prod_{i=1}^{k} p_i^{n_i},$$

where $p_1 < p_2 < \dots < p_k$ are primes and the $n_i$ are positive integers.
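
As a small illustration of why this uniqueness matters when primes stand for relations, the sketch below factorizes an integer by trial division and maps the prime factors back to relation names. The helper `factorize` and the mapping `prime2rel` are hypothetical and serve only as an example; they are not taken from the paper.

```python
def factorize(n: int) -> dict:
    """Return {prime: exponent} for an integer n > 1, using trial division."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains after dividing out small factors is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


# Hypothetical relation-to-prime assignment, inverted for decoding.
prime2rel = {2: "r1", 3: "r2", 5: "r3"}

value = 12  # e.g. a product of relation primes: 2 * 2 * 3
decoded = {prime2rel[p]: e for p, e in factorize(value).items()}
print(decoded)  # relation r1 appears twice, r2 once
```

Because the factorization is unique, a product of relation primes always decodes to the same multiset of relations. The product alone does not record the order in which the relations were traversed, since $2 \cdot 3 = 3 \cdot 2$.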

Figures (6)

  • Figure 1: An example multi-relational graph with 5 nodes and 3 types of relation.
  • Figure 2: (a) Simple graph with multiple relations between two nodes. (b) The same graph with node 1 split into 2 substitute nodes, namely 1a and 1b.
  • Figure 3: Illustration of the lossless generation of the value for $P^3[D,E]$, corresponding to two $3$-hop paths. (a) The lossless $P, P^2$ PAMs and the mappings $\phi_{1},\phi_{2},\phi_{3}$ that will be used. (b) The process of generating $P^3[D,E]$ without loss, through the aggregation (AP) and chaining (CP) steps.
  • Figure 4: Illustration of the Bag of Paths extraction methodology across multiple scales, i.e. node-, edge- and graph-level tasks.
  • Figure 5: Time needed to run the two Bag of Paths variants across datasets, while varying the number of $k$-hop PAMs calculated. The y-axis is in log-scale and time is measured in seconds. Missing points indicate that the time needed exceeded 2.7 hours.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Theorem B.1: Fundamental Theorem of Arithmetic (FTA)