On The Reasonable Effectiveness of Relational Diagrams: Explaining Relational Query Patterns and the Pattern Expressiveness of Relational Languages

Wolfgang Gatterbauer, Cody Dunne

TL;DR

This work introduces a language-independent semantic notion of relational query patterns and a complete, sound diagrammatic representation called Relational Diagrams. By focusing on pattern expressiveness rather than logical expressiveness alone, the authors establish a hierarchy among $\textsf{Datalog}^{*}$, $\textsf{RA}^{*}$, $\textsf{TRC}^{*}$, and $\textsf{SQL}^{*}$, and show that Relational Diagrams can faithfully capture all patterns in the non-disjunctive fragment, unlike RA-based diagrams. They extend the diagrams with union cells to achieve relational completeness and validate the approach via textbook analyses and a controlled user study, which show that users recognize patterns more quickly and accurately than with SQL. The practical impact lies in improving query understanding and education by providing a visualization that preserves the underlying query pattern across schemas and languages. They also discuss limitations and future work toward representing disjunction and more complex SQL features diagrammatically.

Abstract

Comparing relational languages by their logical expressiveness is well understood. Less well understood is how to compare relational languages by their ability to represent relational query patterns. Indeed, what are query patterns other than "a certain way of writing a query"? And how can query patterns be defined across procedural and declarative languages, irrespective of their syntax? To the best of our knowledge, we provide the first semantic definition of relational query patterns by using a variant of structure-preserving mappings between the relational tables of queries. This formalism allows us to analyze the relative pattern expressiveness of relational language fragments and create a hierarchy of languages with equal logical expressiveness yet different pattern expressiveness. Notably, for the non-disjunctive language fragment, we show that relational calculus can express a larger class of patterns than the basic operators of relational algebra. Our language-independent definition of query patterns opens novel paths for assisting database users. For example, these patterns could be leveraged to create visual query representations that faithfully represent query patterns, speed up interpretation, and provide visual feedback during query editing. As a concrete example, we propose Relational Diagrams, a complete and sound diagrammatic representation of safe relational calculus that is provably (i) unambiguous, (ii) relationally complete, and (iii) able to represent all query patterns for unions of non-disjunctive queries. Among all diagrammatic representations for relational queries that we are aware of, ours is the only one with these three properties. Furthermore, our anonymously preregistered user study shows that Relational Diagrams allow users to recognize patterns meaningfully faster and more accurately than SQL.

Paper Structure

This paper contains 62 sections, 8 theorems, 74 equations, 47 figures, and 1 table.

Key Result

Theorem 1

[Logical expressiveness] $\textsf{Datalog}^{*}$, $\textsf{RA}^{*}$, $\textsf{TRC}^{*}$, and $\textsf{SQL}^{*}$ have the same logical expressiveness.
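To make the distinction between logical expressiveness and pattern expressiveness concrete, here is a minimal sketch, not taken from the paper. It runs two formulations of the textbook-style query "sailors who reserved all boats" against SQLite: a calculus-style version with nested negation and an algebra-style version built from set difference. The schema (`Sailor`, `Boat`, `Reserves`), the toy rows, and both query texts are assumptions made for illustration. SQLite stands in for the paper's language fragments, which it does not implement exactly.

```python
import sqlite3

# Hypothetical sailors/boats schema and toy data (assumed, not from the paper).
SETUP = """
CREATE TABLE Sailor(sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boat(bid INTEGER PRIMARY KEY);
CREATE TABLE Reserves(sid INTEGER, bid INTEGER);
INSERT INTO Sailor VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Boat VALUES (101), (102);
INSERT INTO Reserves VALUES (1, 101), (1, 102), (2, 101);
"""

# Calculus-style: "there is no boat that the sailor has not reserved".
# Each of Sailor, Boat, and Reserves is referenced exactly once.
CALCULUS_STYLE = """
SELECT DISTINCT S.sname
FROM Sailor S
WHERE NOT EXISTS (
    SELECT * FROM Boat B
    WHERE NOT EXISTS (
        SELECT * FROM Reserves R
        WHERE R.sid = S.sid AND R.bid = B.bid))
"""

# Algebra-style: all sailors minus those missing at least one boat,
# computed with set difference (EXCEPT). Sailor is referenced three times.
ALGEBRA_STYLE = """
SELECT DISTINCT S.sname
FROM Sailor S
WHERE S.sid IN (
    SELECT sid FROM Sailor
    EXCEPT
    SELECT sid FROM (
        SELECT S2.sid AS sid, B.bid AS bid FROM Sailor S2, Boat B
        EXCEPT
        SELECT R.sid AS sid, R.bid AS bid FROM Reserves R))
"""


def run_both():
    """Run both formulations on the same in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    calculus = sorted(row[0] for row in conn.execute(CALCULUS_STYLE))
    algebra = sorted(row[0] for row in conn.execute(ALGEBRA_STYLE))
    conn.close()
    return calculus, algebra


calculus_result, algebra_result = run_both()
print(calculus_result, algebra_result)  # prints ['Ann'] ['Ann']
```

Both queries return the same answer on this instance, which is consistent with equal logical expressiveness. They differ in how many table references they use and how those references are connected, and that structural difference is what the paper's notion of a query pattern is meant to capture. Agreement on one toy database is only an illustration, not a proof of equivalence.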

Figures (47)

  • Figure 1: DFQL (DBLP:journals/vlc/CatarciCLB97) visualization of the $\textsf{TRC}$ query from Example 1. Notice the 3 instances of the Sailor relation and thus a different "structure" of the visualization from the original query.
  • Figure 2: Relational Diagrams representations of the two queries from Example 1 (cowbook:2002) and Example 2 (date2004introduction). Notice the similar "relational query patterns."
  • Figure 3: EBNF Grammar of $\textsf{SQL}^{*}$: Statements enclosed in [ ] are optional; statements separated by $\mid$ indicate a choice between alternatives; parentheses without quotation marks ( ) group alternative choices; parentheses with quotation marks '(' ')' form part of the test. Additionally, the main query requires the DISTINCT keyword (if non-Boolean), and all join and selection predicates need to be guarded (Definition 3), i.e., reference at least one table within the scope of the last NOT.
  • Figure 4: Example $\textsf{SQL}$ with disjunction.
  • Figure 5: \ref{sec:fromTRCtoRD}: Example $\textsf{TRC}^{*}$ expression (a), derivation of the negation hierarchy (b, c), and corresponding Relational Diagram* (d). Colored partitions $q_i$ (purple) and table variables $r_i$ (blue) are not part of the diagrams and are shown only to discuss the correspondence. \ref{sec:fromRDtoTRC}: $\textsf{TRC}^{*}$ stub after step 2 of the translation (e).
  • ...and 42 more figures

Theorems & Definitions (56)

  • Example 1: Understanding the structure of a $\textsf{TRC}$ query
  • Example 2: Comparing $\textsf{RC}$ queries from textbooks
  • Definition 1: $\textsf{Datalog}^{*}$
  • Definition 2: $\textsf{RA}^{*}$
  • Definition 3: Guarded predicate
  • Definition 4: $\textsf{TRC}^{*}$
  • Definition 5
  • Theorem 1
  • Definition 6: Validity
  • Theorem 2: Unambiguous Relational Diagrams*
  • ...and 46 more