Incremental Consistent Updating of Incomplete Databases

Jacques Chabin; Mirian Halfeld Ferrari; Nicolas Hiot; Dominique Laurent

TL;DR

The paper addresses consistency maintenance for dynamic, incomplete databases under a fixed set of tuple-generating dependencies. It extends prior work with an incremental, disk-based updating framework that bounds chase growth with a maximal null degree $\delta_{max}$ and simplifies instances through a core-based step built on LinkedNull sets. Two implementations are presented: a graph-database one (Neo4j), which uses Cypher queries for chasing and for finding LinkedNull sets, and a relational one (MySQL) with comparable functionality. Overall computation times are comparable across the two, but the cost of individual steps varies with schema design. Experiments on the Movie, GOT, and LDBC datasets show that the approach scales beyond main memory and that graph-schema choices affect the efficiency of incremental processing; the evaluation is set up to be reproducible. Overall, the work offers a scalable approach to incremental consistency maintenance for incomplete data in both graph and relational DBMS contexts.

Abstract

Efficient consistency maintenance of incomplete and dynamic real-life databases is a quality label for further data analysis. In prior work, we tackled the generic problem of database updating in the presence of tuple-generating constraints from a theoretical viewpoint. The current paper considers the usability of our approach by (a) introducing incremental update routines (instead of the previous from-scratch versions) and (b) removing the restriction that the contents of the database must fit in main memory. In doing so, this paper offers new algorithms and proposes queries and data models, inviting discussion on the representation of incompleteness in databases. We also propose implementations under a graph database model and the traditional relational database model. Our experiments show that computation times are similar globally but point to discrepancies in some steps.
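
To make the idea of a chase bounded by a maximal null degree more concrete, here is a minimal in-memory sketch in Python. It is not the paper's algorithm, which runs as Cypher or SQL queries against a disk-based DBMS. The exact definition of null degree is not given in this summary, so the sketch assumes that a fresh null gets degree one plus the highest degree among the nulls in the triggering match. All names (`Null`, `bounded_chase`, the `?x` variable convention, the example rule) are hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Null:
    """A labelled null; `degree` is the assumed depth of null creation."""
    ident: int
    degree: int


def is_var(term):
    # Hypothetical convention: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")


def match(atom, fact, binding):
    """Extend `binding` so that `atom` maps onto `fact`, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def homomorphisms(atoms, facts, binding=None):
    """Yield every extension of `binding` that maps all `atoms` into `facts`."""
    binding = binding or {}
    if not atoms:
        yield binding
        return
    for fact in facts:
        extended = match(atoms[0], fact, binding)
        if extended is not None:
            yield from homomorphisms(atoms[1:], facts, extended)


def bounded_chase(facts, rules, delta_max):
    """Apply rules (body, head) until no rule fires within the degree bound."""
    facts = set(facts)
    fresh = count(1)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Materialise the matches first, since `facts` grows inside the loop.
            for binding in list(homomorphisms(body, facts)):
                # Skip matches whose head is already satisfied.
                if next(homomorphisms(head, facts, binding), None) is not None:
                    continue
                existential = sorted(
                    {t for atom in head for t in atom[1:]
                     if is_var(t) and t not in binding}
                )
                # Assumed rule: new degree = 1 + max degree of nulls in the match.
                degree = 1 + max(
                    (v.degree for v in binding.values() if isinstance(v, Null)),
                    default=0,
                )
                if existential and degree > delta_max:
                    continue  # creating this null would exceed the bound
                full = dict(binding)
                for var in existential:
                    full[var] = Null(next(fresh), degree)
                for atom in head:
                    facts.add((atom[0],) + tuple(full.get(t, t) for t in atom[1:]))
                changed = True
    return facts


# Hypothetical constraint: every person has a parent who is also a person.
rules = [([("Person", "?x")], [("HasParent", "?x", "?y"), ("Person", "?y")])]
result = bounded_chase({("Person", "alice")}, rules, delta_max=2)
```

Without the bound, this rule would keep generating ancestors forever. With `delta_max=2`, the sketch stops after two generations of nulls.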

Paper Structure (18 sections, 2 theorems, 11 figures, 7 algorithms)

This paper contains 18 sections, 2 theorems, 11 figures, 7 algorithms.

Introduction
Motivating Example
From-scratch and incremental approaches at a glance.
Preliminaries
Simplification with Respect to Nulls: a Basic Operation
Incremental Updating
Insertion
Deletion
Queries for Incremental Processing
Graph Data Model
Query for chasing.
Query to find LinkedNull sets.
Relational Data Model
Discussion
Experimental Results
...and 3 more sections

Key Result

Proposition 1

Given $h_i$ and $h_{i'}$ in $q_{core}(I)$, $h_i \preceq h_{i'}$ holds if and only if, for every $j=1 , \dots , p$, we have:

Figures (11)

Figure 1: Set of (general) constraints
Figure 2: Queries used in our algorithms
Figure 3: Graph database schema.
Figure 4: Graph database instance (extract). Optimization labels and attributes are omitted.
Figure 5: Cypher template for chasing
...and 6 more figures

Theorems & Definitions (11)

Example 1
Example 2
Definition 1
Proposition 1
Example 3
Corollary 1
Example 4
Example 5
Example 6
Example 7
...and 1 more
