A Formalism and Library for Database Visualization

Eugene Wu, Xiang Yu Tuang, Antonio Li, Vareesh Bainwala

TL;DR

This work defines database visualization as a constraint-driven mapping from relational database schemas and constraints to visual representations, ensuring faithfulness by preserving keys and foreign-key relations in the visuals. It extends traditional single-table graphical grammars with foreign attributes and explicit/implicit representations of foreign-key constraints, enabling faithful multi-table visualizations. The authors implement a JavaScript library, dvl, that compiles constraint-based specifications into SQL task graphs and renders layouts, supporting single-table and multi-table workflows, including spatial nesting and shared-scales designs. They further demonstrate the approach with case studies, ER diagrams, and space-filling layouts via HiVE, arguing that many classic visual designs (e.g., node-link, parallel coordinates, facets) naturally emerge from relational modeling decisions. The work lays a foundational theory for a broader space of database visualizations and outlines future directions for scalability, extended constraints, and interactive analysis.

Abstract

Existing data visualization formalisms are restricted to single-table inputs, which makes visualization grammars like Vega-Lite or ggplot2 tedious to use, overly complex in their APIs, and unsound when visualizing multi-table data. This paper presents the first visualization formalism to support databases as input, in other words, *database visualization*. A visualization specification is defined as a mapping from database constraints (e.g., schemas, types, foreign keys) to visual representations of those constraints, and we state that a visualization is *faithful* if it visually preserves the underlying database constraints. This formalism explains how visualization designs are the result of implicit data modeling decisions. We further develop a JavaScript library called dvl and use a series of case studies to show its expressiveness over specialized visualization systems and existing grammar-based languages.

Paper Structure

This paper contains 32 sections, 17 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: In the node-link visualization, (a) the links $V_{E'}$ only appear to connect the points $V_N$. (b) The points and links become inconsistent if $V_N$ changes (e.g., adding jitter to the y position). (c) dvl preserves the foreign key relationships by referring to the mark positions (e.g., $V_{N}[s].(x,y)$) in the visual mapping $V_{E}$.
  • Figure 2: The foreign attribute expression $V_N[E.t].(x,y)$ follows foreign key relationships to index into $N[\textcolor{blue}{2}]$ using $\textcolor{blue}{E.t}$, index into $V_N[\textcolor{red}{2}]$ using $\textcolor{red}{N.id}$, then retrieve the $x,y$ mark properties.
  • Figure 3: Using foreign attributes (in purple) preserves foreign key relationships between the nodes and edges tables and between their views $V_E$ and $V_N$. The single-table approach (in red) does not preserve the relationship between $V_N$ and $V_{E'}$, and may break the relationship between $N$ and $E$ as well.
  • Figure 4: The constraint is explicitly represented as text that equates ids from $S$ and $T$. The user can identify the corresponding $T$ mark by finding the point with the label, and the $S$ mark by its position along the y axis.
  • Figure 5: The foreign key relationship between $S$ and $T$ is preserved by (a) shared scale domains and can be progressively reinforced by (b) shared scale ranges, (c) shared channels, (d) absolute alignment of the views, and (e) spatial proximity.
  • ...and 5 more figures
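
The contrast drawn in Figures 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dvl API: the table contents, the scale factor, the constant jitter offset, and every function name (`render_nodes`, `render_edges_single_table`, `render_edges_foreign`, `is_consistent`) are hypothetical. The single-table approach recomputes link endpoints from the joined data, so the links drift away from the points once $V_N$ changes, while the foreign-attribute approach reads the endpoints from the node marks themselves, in the spirit of $V_N[E.s].(x,y)$ and $V_N[E.t].(x,y)$.

```python
# Hypothetical data: a nodes table N(id, a, b) and an edges table E(s, t),
# where E.s and E.t are foreign keys referencing N.id.
nodes = [{"id": 1, "a": 1, "b": 2}, {"id": 2, "a": 4, "b": 6}, {"id": 3, "a": 7, "b": 3}]
edges = [{"s": 1, "t": 2}, {"s": 2, "t": 3}]


def render_nodes(nodes, jitter=0):
    """V_N: map each node row to a point mark, keyed by the node id.

    The factor 10 stands in for a scale; jitter is a constant y offset
    used here to mimic a change to the node view.
    """
    return {n["id"]: {"x": n["a"] * 10, "y": n["b"] * 10 + jitter} for n in nodes}


def render_edges_single_table(nodes, edges):
    """V_E': join E with N and recompute positions from the data.

    The result does not depend on V_N, so it cannot follow changes to V_N.
    """
    by_id = {n["id"]: n for n in nodes}
    return [
        {
            "x1": by_id[e["s"]]["a"] * 10, "y1": by_id[e["s"]]["b"] * 10,
            "x2": by_id[e["t"]]["a"] * 10, "y2": by_id[e["t"]]["b"] * 10,
        }
        for e in edges
    ]


def render_edges_foreign(v_n, edges):
    """V_E: follow the foreign keys into the node marks, V_N[E.s] and V_N[E.t]."""
    return [
        {
            "x1": v_n[e["s"]]["x"], "y1": v_n[e["s"]]["y"],
            "x2": v_n[e["t"]]["x"], "y2": v_n[e["t"]]["y"],
        }
        for e in edges
    ]


def is_consistent(v_n, edges, links):
    """Check that every link starts and ends exactly on its referenced node marks."""
    return all(
        (link["x1"], link["y1"]) == (v_n[e["s"]]["x"], v_n[e["s"]]["y"])
        and (link["x2"], link["y2"]) == (v_n[e["t"]]["x"], v_n[e["t"]]["y"])
        for e, link in zip(edges, links)
    )


if __name__ == "__main__":
    jittered = render_nodes(nodes, jitter=5)
    print(is_consistent(jittered, edges, render_edges_foreign(jittered, edges)))
    print(is_consistent(jittered, edges, render_edges_single_table(nodes, edges)))
```

With no jitter both variants agree with the node marks; once the node view is shifted, only the variant that dereferences the marks stays aligned, which is the situation Figure 1(b) and 1(c) describe.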

Theorems & Definitions (18)

  • Example 1
  • Example 2
  • Example 3
  • Example 4
  • Example 5
  • Example 6
  • Example 7
  • Example 8
  • Example 9
  • Example 10
  • ...and 8 more