
AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning

Michael Färber, David Lamprecht, Yuni Susanti

TL;DR

This work tackles the gap between RDF semantics and graph machine learning by introducing AutoRDF2GML, a framework that semi-automatically converts RDF data into ready-to-use heterogeneous graph datasets. It supports both content-based features derived from RDF datatype properties and topology-based features from RDF object properties, enabling diverse ML tasks such as link prediction and node classification. The authors also present new RDF-based benchmarks (SOA-SW, LPWC, and additional datasets) to enable rigorous evaluation of GML approaches on semantic graphs. The framework is designed for accessibility via a single-file configuration and pip installation, bridging the Semantic Web and Graph ML communities and facilitating scalable, RDF-based ML applications.

Abstract

In this paper, we introduce AutoRDF2GML, a framework designed to convert RDF data into data representations tailored for graph machine learning tasks. AutoRDF2GML enables, for the first time, the creation of both content-based features -- i.e., features based on RDF datatype properties -- and topology-based features -- i.e., features based on RDF object properties. Characterized by automated feature extraction, AutoRDF2GML makes it possible even for users less familiar with RDF and SPARQL to generate data representations ready for graph machine learning tasks, such as link prediction, node classification, and graph classification. Furthermore, we present four new benchmark datasets for graph machine learning, created from large RDF knowledge graphs using our framework. These datasets serve as valuable resources for evaluating graph machine learning approaches, such as graph neural networks. Overall, our framework effectively bridges the gap between the Graph Machine Learning and Semantic Web communities, paving the way for RDF-based machine learning applications.
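
The distinction between content-based and topology-based features can be illustrated with a minimal sketch. This is not the AutoRDF2GML API or the authors' implementation; the toy triples, the `Literal` marker class, and the `split_triples` helper are all hypothetical names introduced here for illustration. The sketch only shows the underlying idea: triples whose object is a literal (datatype properties) become raw node attributes, while triples whose object is another resource (object properties) become typed edges.

```python
from collections import defaultdict


class Literal(str):
    """Hypothetical marker: an object value that is an RDF literal, not an IRI."""


# Hypothetical toy triples (subject, predicate, object); not taken from the paper.
TRIPLES = [
    ("ex:paper1", "ex:title", Literal("Graph learning on RDF")),
    ("ex:paper1", "ex:year", Literal("2024")),
    ("ex:paper1", "ex:hasAuthor", "ex:author1"),
    ("ex:paper2", "ex:title", Literal("Semantic Web benchmarks")),
    ("ex:paper2", "ex:hasAuthor", "ex:author1"),
    ("ex:paper2", "ex:cites", "ex:paper1"),
]


def split_triples(triples):
    """Separate datatype-property triples from object-property triples.

    Returns (node_attributes, edges):
      node_attributes: {subject: {predicate: literal value}}, the raw material
                       for content-based features.
      edges:           {predicate: [(subject, object), ...]}, the raw material
                       for topology-based features.
    """
    node_attributes = defaultdict(dict)
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            # Datatype property: the value describes the node itself.
            node_attributes[subject][predicate] = str(obj)
        else:
            # Object property: the value links to another node.
            edges[predicate].append((subject, obj))
    return dict(node_attributes), dict(edges)


node_attributes, edges = split_triples(TRIPLES)
```

In this sketch, `node_attributes` would still have to be encoded numerically (for example, by embedding the title strings) and `edges` indexed per edge type before a graph neural network could consume them; those steps are omitted here.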

Paper Structure (16 sections, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Overview of AutoRDF2GML.
  • Figure 2: Example n-ary relation.
  • Figure 3: Example multi-hop relation from Linked Papers With Code.
  • Figure 4: Overview of the heterogeneous graph dataset SOA-SW.
  • Figure 5: Overview of the heterogeneous graph dataset LPWC.