KGPrune: a Web Application to Extract Subgraphs of Interest from Wikidata with Analogical Pruning

Pierre Monnin, Cherif-Hassan Nousradine, Lucas Jarnac, Laurel Zuckerman, Miguel Couceiro

TL;DR

KGPrune is a Web Application that, given seed entities of interest and properties to traverse, extracts their neighboring subgraphs from Wikidata and relies on a frugal pruning algorithm based on analogical reasoning to only keep relevant neighbors while pruning irrelevant ones.

Abstract

Knowledge graphs (KGs) have become ubiquitous, publicly available knowledge sources and now cover an ever-increasing array of domains. However, not all the knowledge they represent is useful or pertinent to a new application or specific task. Moreover, because of their increasing size, handling large KGs in their entirety raises scalability issues. These two aspects call for efficient methods to extract subgraphs of interest from existing KGs. To this end, we introduce KGPrune, a Web Application that, given seed entities of interest and properties to traverse, extracts their neighboring subgraphs from Wikidata. To avoid topical drift, KGPrune relies on a frugal pruning algorithm based on analogical reasoning to keep only relevant neighbors while pruning irrelevant ones. The value of KGPrune is illustrated by two concrete applications: bootstrapping an enterprise KG and extracting knowledge related to looted artworks.
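
The extraction idea described above (expand from seed entities along chosen properties, keep relevant neighbors, prune the rest) can be illustrated with a minimal sketch. This is not KGPrune's implementation: the toy graph, the function `extract_subgraph`, and the `keep` predicate are hypothetical, and `keep` merely stands in for the analogy-based pruning decision, which is not detailed in this summary. The sketch also assumes that pruned neighbors are not expanded further.

```python
from collections import deque

# Toy graph (made-up identifiers): (node, property) -> neighboring nodes.
TOY_KG = {
    ("seed", "p1"): ["a", "b"],
    ("a", "p1"): ["c"],
    ("b", "p1"): ["d"],
    ("a", "p2"): ["e"],
}


def extract_subgraph(seeds, properties, keep):
    """Breadth-first expansion from each seed along the allowed properties.

    `keep(seed, neighbor)` is a placeholder for the pruning decision.
    Assumption: a pruned neighbor is recorded but never expanded.
    """
    kept, pruned = set(), set()
    for seed in seeds:
        queue = deque([seed])
        visited = {seed}
        while queue:
            node = queue.popleft()
            for prop in properties:
                for neighbor in TOY_KG.get((node, prop), []):
                    if neighbor in visited:
                        continue
                    visited.add(neighbor)
                    if keep(seed, neighbor):
                        kept.add(neighbor)
                        queue.append(neighbor)
                    else:
                        pruned.add(neighbor)
    return kept, pruned


# Stand-in decision rule: prune "b", keep everything else.
kept, pruned = extract_subgraph(["seed"], ["p1"], keep=lambda s, n: n != "b")
```

In this toy run, only property `p1` is traversed, so `e` is never reached, and `d` is never reached because its only path goes through the pruned node `b`.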

Paper Structure

This paper contains 9 sections, 1 equation, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: Main screens of KGPrune. (a) Upload form where two CSV files are required from the user: one indicating QIDs of seed entities and one indicating PIDs of properties to traverse. (b) Result page where the user can choose to visualize the extracted subgraphs or download them in JSON or RDF. (c) Visualization of the extracted subgraphs where seed entities are in yellow, kept neighbors in green, and pruned neighbors in red.
  • Figure 2: Technical architecture of KGPrune. Users can upload subgraph extraction tasks via the website or the API. These tasks are then submitted as SLURM jobs to our computing cluster.
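
As a rough illustration of the two inputs mentioned in the Figure 1 caption, the sketch below writes one CSV file of seed QIDs and one of PIDs. The exact layout KGPrune expects is not specified in this summary, so the one-identifier-per-row format, the file names, and the helper `write_id_file` are assumptions. The identifiers are real Wikidata examples chosen for illustration only: Q42 (Douglas Adams), Q5 (human), P31 (instance of), P279 (subclass of).

```python
import csv
import tempfile
from pathlib import Path


def write_id_file(path: Path, ids: list[str]) -> None:
    """Write one identifier per row (assumed layout, not confirmed by the source)."""
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for identifier in ids:
            writer.writerow([identifier])


# Hypothetical file names, written to a temporary directory.
out_dir = Path(tempfile.mkdtemp())
seeds_file = out_dir / "seeds.csv"
properties_file = out_dir / "properties.csv"

write_id_file(seeds_file, ["Q42", "Q5"])         # seed entities (QIDs)
write_id_file(properties_file, ["P31", "P279"])  # properties to traverse (PIDs)
```

The two resulting files correspond to what the upload form asks for; whether a header row is expected would need to be checked against the application itself.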