Subgraph2vec: A random walk-based algorithm for embedding knowledge graphs

Elika Bozorgi, Saber Soleimani, Sakher Khalil Alqaiidi, Hamid Reza Arabnia, Krzysztof Kochut

TL;DR

This paper tackles scalable knowledge graph embedding by introducing Subgraph2vec, a random-walk-based method that confines walks to a user-defined schema subgraph within the KG. Walks are generated inside the subgraph and embedded with a modified skip-gram model, producing vector representations used for link prediction. Empirical results on YAGO39K and NELL show Subgraph2vec surpassing regpattern2vec and metapath2vec in most cases, demonstrating the benefit of user-guided, subgraph-constrained walks. The approach offers a flexible, unsupervised avenue for KG embedding with potential extensions like edge weighting and broader downstream tasks.

Abstract

Graphs are an important data representation that occurs naturally in real-world applications \cite{goyal2018graph}. Analyzing graphs therefore gives users better insight in areas such as anomaly detection \cite{ma2021comprehensive}, decision making \cite{fan2023graph}, clustering \cite{tsitsulin2023graph}, and classification \cite{wang2021mixup}, among others. However, most of these methods require substantial computation time and memory. Embedding is one way to reduce these costs. Knowledge graph (KG) embedding is a technique that produces a vector representation of a KG: it represents the entities and relations of a KG in a low-dimensional space while preserving their semantic meaning. There are various graph embedding methods, including random walk-based methods such as node2vec, metapath2vec, and regpattern2vec. However, most of these methods bias the walks according to a rigid pattern that is usually hard-coded in the algorithm. In this work, we introduce \textit{subgraph2vec} for embedding KGs, in which walks are run inside a user-defined subgraph. We use this embedding for link prediction and show that our method outperforms the previous ones in most cases.
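
To make the idea of walks confined to a user-defined subgraph concrete, here is a minimal sketch in Python. It is not the authors' implementation. The toy triples, the entity types, the `SCHEMA_SUBGRAPH` set, the function names, and the choice to treat edges as undirected are all assumptions made for illustration. The sketch only generates walks; the modified skip-gram model mentioned in the paper is not reproduced.

```python
import random
from collections import defaultdict

# Toy KG as (head, relation, tail) triples. All names are hypothetical.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("acme", "locatedIn", "paris"),
    ("alice", "bornIn", "paris"),
    ("bob", "likes", "jazz"),
]
ENTITY_TYPE = {
    "alice": "Person", "bob": "Person",
    "acme": "Org", "paris": "City", "jazz": "Genre",
}

# User-defined schema subgraph: the (head type, relation, tail type)
# edges that walks are allowed to traverse (illustrative assumption).
SCHEMA_SUBGRAPH = {
    ("Person", "worksAt", "Org"),
    ("Org", "locatedIn", "City"),
}


def build_adjacency(triples, entity_type, schema):
    """Keep only triples whose typed edge appears in the schema subgraph."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        if (entity_type[head], rel, entity_type[tail]) in schema:
            adj[head].append(tail)
            adj[tail].append(head)  # assumption: edges are walked in both directions
    return adj


def random_walk(adj, start, length, rng):
    """Uniform random walk that never leaves the filtered adjacency."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1])
        if not neighbors:
            break  # dead end inside the subgraph
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(adj, walks_per_node, length, seed=0):
    """Start a fixed number of walks from every node in the subgraph."""
    rng = random.Random(seed)
    return [
        random_walk(adj, node, length, rng)
        for node in sorted(adj)
        for _ in range(walks_per_node)
    ]


adj = build_adjacency(TRIPLES, ENTITY_TYPE, SCHEMA_SUBGRAPH)
walks = generate_walks(adj, walks_per_node=2, length=5)
```

In this sketch the `bornIn` and `likes` edges fall outside the schema, so no walk can reach `jazz` or step directly between `alice` and `paris`. The resulting walks could then be passed as "sentences" to any skip-gram trainer to obtain node vectors.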

Paper Structure

This paper contains 11 sections, 2 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: ROC curves for different links, comparing Subgraph2vec, Regpattern2vec, and Metapath2vec on NELL and YAGO.