Transforming Property Graphs
Angela Bonifati, Filip Murlak, Yann Ramusat
TL;DR
This work introduces a declarative framework for transforming property graphs, using the Graph Pattern Calculus (GPC) to express transformations that take data values into account. It provides a Skolem-based mechanism to generate output identifiers, content constructors to define labels and properties, and an operational semantics that yields a well-formed output graph. It also proves that checking the consistency of transformation rules and GPC satisfiability are both PSPACE-complete, which motivates handling conflicts at run-time. A proof-of-concept implementation translates the rules into openCypher scripts, and an empirical evaluation on realistic benchmarks (including iBench-derived cases and the Offshore Leaks dataset) shows scalable performance and readability improvements over handcrafted scripts. The results highlight practical benefits for data interoperability and graph transformation tasks, while acknowledging the inherent complexity of static analysis and endorsing dynamic conflict detection during execution as a viable approach.
Abstract
In this paper, we study a declarative framework for specifying transformations of property graphs. To express such transformations, we leverage queries formulated in the Graph Pattern Calculus (GPC), which is an abstraction of the common core of the recent standard graph query languages GQL and SQL/PGQ. In contrast to previous frameworks targeting graph topology only, we focus on the impact of data values on the transformations, which is crucial in addressing users' needs. In particular, we study the complexity of checking whether the transformation rules specify conflicting values for properties, and we show this is closely related to the satisfiability problem for GPC. We prove that both problems are PSPACE-complete. We have implemented our framework in openCypher. We show the flexibility and usability of our framework by leveraging an existing data integration benchmark, adapting it to our needs. We also evaluate the overhead incurred by detecting potential inconsistencies at run-time, and the impact of several optimization tools in a Cypher-based graph database, by providing a comprehensive comparison of different implementation variants. The results of our experimental study show that our framework exhibits large practical benefits for transforming property graphs compared to ad-hoc transformation scripts.
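
To make the notion of conflicting property values concrete, the following is a minimal Python sketch of run-time conflict detection. It is not the implementation described above (which generates openCypher); the names `skolem_id`, `OutputGraph`, `PropertyConflict`, `transform`, and the sample rows are all hypothetical and chosen only for illustration. The sketch assumes that an output identifier is determined by a constructor name and its arguments, so two rule applications with the same arguments refer to the same output node, and a conflict arises when they assign different values to the same property.

```python
from typing import Any, Dict, Hashable, Iterable, Tuple

NodeId = Tuple[Hashable, ...]


class PropertyConflict(Exception):
    """Raised when one property of one output node receives two different values."""


def skolem_id(constructor: str, *args: Hashable) -> NodeId:
    # Hypothetical Skolem-style identifier: the same constructor name and
    # arguments always yield the same output identifier.
    return (constructor, *args)


class OutputGraph:
    """Illustrative output graph that stores only node properties."""

    def __init__(self) -> None:
        self.nodes: Dict[NodeId, Dict[str, Any]] = {}

    def set_property(self, node_id: NodeId, key: str, value: Any) -> None:
        props = self.nodes.setdefault(node_id, {})
        # Dynamic check: re-assigning the same value is harmless,
        # assigning a different value is reported as a conflict.
        if key in props and props[key] != value:
            raise PropertyConflict(
                f"{node_id}: property {key!r} set to {props[key]!r} and {value!r}"
            )
        props[key] = value


def transform(rows: Iterable[Dict[str, str]]) -> OutputGraph:
    # Assumed rule: every input row contributes a City node identified by
    # its name, with properties 'name' and 'country'.
    graph = OutputGraph()
    for row in rows:
        city = skolem_id("City", row["city"])
        graph.set_property(city, "name", row["city"])
        graph.set_property(city, "country", row["country"])
    return graph


# Hypothetical input: two rows that agree on the country of the same city.
rows = [
    {"person": "p1", "city": "Lyon", "country": "FR"},
    {"person": "p2", "city": "Lyon", "country": "FR"},
]
```

With these rows, `transform(rows)` produces a single `City` node, because both rows map to the same identifier and agree on every property. Adding a row that gives the same city a different country makes `set_property` raise `PropertyConflict`, which is the kind of inconsistency that depends on the input data and is therefore checked during execution in this sketch.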
