Revisiting the Expressiveness Landscape of Data Graph Queries

Michael Benedikt; Anthony Widjaja Lin; Di-De Yen

TL;DR

This work analyzes the expressive power of graph query languages for data graphs, focusing on three canonical families: $RPQ$-based extensions, Walk Logic ($WL$), and first-order logic with transitive closure, and shows how data coupling adds complexity. It demonstrates that $FO(ERDPQ)$ subsumes several existing languages ($WL$, $RDPQ$, $GPC$) while $FO^*(\text{≡data})$ subsumes $RDPQ$ but is incomparable with others, outlining a rich landscape of expressiveness. To unify these approaches, the paper introduces $FO^*(ERDPQ)$, extending $FO(ERDPQ)$ with transitive closure, which subsumes all prior languages and provides a single maximal framework, albeit with non-elementary worst-case data complexity. Additionally, it introduces Multi-Path Walk Logic (MWL), an extension of WL with multi-path comparisons, which is strictly more expressive than WL and is expressible within $FO(ERDPQ)$ but does not reach the full unifying power of $FO^*(ERDPQ)$. The results offer a conceptual and technical bridge for graph querying, guiding future work on tractable fragments and practical implementations.

Abstract

The study of graph queries in database theory has spanned more than three decades, resulting in a multitude of proposals for graph query languages. These languages differ in their underlying mechanisms. We can identify three main families of languages, with the canonical representatives being: (1) regular path queries, (2) walk logic, and (3) first-order logic with transitive closure operators. This paper provides a complete picture of the expressive power of these languages in the context of data graphs. Specifically, we consider a graph data model that supports querying over both data and topology. An example of such a query is "Does there exist a path between two different persons in a social network with the same last name?" We also show that an extension of (1), augmented with transitive closure operators, can unify the expressivity of (1)--(3) without increasing the query evaluation complexity.
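
To make the example query in the abstract concrete, here is a minimal Python sketch of a data graph and a direct evaluation of "is there a path between two different persons with the same last name?". The graph, the node names, the `knows` label, and the helper functions are all hypothetical illustrations; the sketch uses plain breadth-first reachability and is not the paper's formalism or any of the query languages it compares.

```python
from collections import deque

# Hypothetical data graph: every node carries a data value (a last name)
# and every edge carries a label. None of these names come from the paper.
last_name = {"ann": "Lee", "bob": "Kim", "cat": "Lee", "dan": "Kim"}
edges = {
    "ann": [("knows", "bob")],
    "bob": [("knows", "cat")],
    "cat": [],
    "dan": [],
}


def reachable(source):
    """Return all nodes reachable from source by a directed path of length >= 1."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for _label, succ in edges[node]:
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def same_name_pairs():
    """Pairs (u, v) of distinct nodes with equal last names and a path from u to v."""
    return sorted(
        (u, v)
        for u in edges
        for v in reachable(u)
        if u != v and last_name[u] == last_name[v]
    )


print(same_name_pairs())  # [('ann', 'cat')]
```

The query combines topology (reachability) with data (equality of last names): `bob` and `dan` share a last name but are not connected, so only the pair `('ann', 'cat')` is returned.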
