Incremental Consistent Updating of Incomplete Databases

Jacques Chabin, Mirian Halfeld Ferrari, Nicolas Hiot, Dominique Laurent

TL;DR

The paper addresses consistency maintenance in dynamic, incomplete databases under a fixed set of tuple-generating dependencies. It extends prior work with an incremental, disk-based updating framework that bounds chase growth through a maximal null degree $\delta_{max}$ and applies a core-based simplification via LinkedNull. Two implementations are presented: one on a graph database (Neo4j), using Cypher for chasing and LinkedNull, and one on a relational database (MySQL) offering comparable functionality. Overall computation times are comparable between the two, but the cost of individual steps varies with schema design. Experiments on the Movie, GOT, and LDBC datasets demonstrate scalability beyond main memory and show how graph-schema choices affect incremental processing efficiency; the experiments are reported to be reproducible. Overall, the work offers a scalable approach to incremental consistency maintenance for incomplete data in both graph and relational DBMS contexts.

Abstract

Efficient consistency maintenance of incomplete and dynamic real-life databases is a quality label for further data analysis. In prior work, we tackled the generic problem of database updating in the presence of tuple-generating constraints from a theoretical viewpoint. The current paper considers the usability of our approach by (a) introducing incremental update routines (instead of the previous from-scratch versions) and (b) removing the restriction that limits the contents of the database to fit in main memory. In doing so, this paper offers new algorithms and proposes queries and data models, inviting discussion on the representation of incompleteness in databases. We also propose implementations under a graph database model and the traditional relational database model. Our experiments show that computation times are similar globally but point to discrepancies in some steps.

Paper Structure (18 sections, 2 theorems, 11 figures, 7 algorithms)

Key Result

Proposition 1

Given $h_i$ and $h_{i'}$ in $q_{core}(I)$, $h_i \preceq h_{i'}$ holds if and only if, for every $j=1 , \dots , p$, we have:

Figures (11)

  • Figure 1: Set of (general) constraints
  • Figure 2: Queries used in our algorithms
  • Figure 3: Graph database schema.
  • Figure 4: Graph database instance (extract). Optimization labels and attributes are omitted.
  • Figure 5: Cypher template for chasing
  • ...and 6 more figures

Theorems & Definitions (11)

  • Example 1
  • Example 2
  • Definition 1
  • Proposition 1
  • Example 3
  • Corollary 1
  • Example 4
  • Example 5
  • Example 6
  • Example 7
  • ...and 1 more