Extracting node comparison insights for the interactive exploration of property graphs

Cristina Aguiar, Jacques Chabin, Alexandre Chanson, Mirian Halfeld-Ferrari, Nicolas Hiot, Nicolas Labroche, Patrick Marcel, Verónika Peralta, Felipe Vasconcelos

TL;DR

The paper addresses automatic extraction of node comparison insights in property graphs by introducing context-based indicators derived from node contexts and graph topology. It formalizes a problem of selecting indicators for grouping and comparing nodes, framed as a 3-partition optimization, and proposes several heuristics (Laplacian, local search, clustering, and random-start variants) to scale to real-world graphs. An end-to-end pipeline covers indicator collection and insight computation, leveraging percentile scaling and path-length contextualization, with Cypher-driven data extraction. Empirical evaluation on diverse real-world datasets demonstrates that simple heuristics rapidly yield actionable insights, while more sophisticated methods yield higher-quality results, enabling interactive exploratory data analysis on property graphs.

Abstract

While scoring nodes in graphs to understand their importance (e.g., in terms of centrality) has been investigated for decades, comparing nodes in property graphs based on their properties has not, to our knowledge, been addressed yet. In this paper, we propose an approach to automatically extract comparisons of nodes in property graphs, to support the interactive exploratory analysis of such graphs. We first present a way of devising comparison indicators using the context of the nodes to be compared. Then, we formally define the problem of using these indicators to group the nodes so that the extracted comparisons are both significant and not straightforward. We propose various heuristics for solving this problem. Our tests on real property graph databases show that simple heuristics can be used to obtain insights within minutes, while slower heuristics are needed to obtain insights of higher quality.

Paper Structure

This paper contains 30 sections, 3 equations, 11 figures, 6 tables, 3 algorithms.

Figures (11)

  • Figure 1: Property graph $G$ of the running example.
  • Figure 2: Property graph type $S$ for the running example
  • Figure 3: Lazy vs eager validation (total time (s))
  • Figure 4: Lazy vs eager validation (candidate indicator collection time (s))
  • Figure 5: CPU time of the 4 heuristics
  • ...and 6 more figures

Theorems & Definitions (20)

  • Example 1
  • Definition 1: Property graph
  • Example 2
  • Definition 2: Property graph type
  • Example 3
  • Definition 3: Element validity and type instance
  • Definition 4: Instance of a graph type
  • Example 4
  • Definition 5: Relationship cardinalities with respect to its endpoints
  • Definition 6: Path
  • ...and 10 more