Approximate Integrity Constraints in Incomplete Databases With Limited Domains

Munqath Al-atar, Attila Sali

TL;DR

This work extends the notion of strongly possible constraints to multivalued dependencies and cross joins in incomplete databases by restricting imputations to the active domain. It introduces the add-based approximation measure $g_5$ and proves $g_3(K) \ge g_5(K)$ for $sp$Keys and $sp$FDs, illustrating that additions can be more effective than deletions in achieving constraint satisfaction. The paper also defines $sp$MVDs and $sp$CJs, analyzes their complexity (noting that single-case checking can be polynomial while general $sp$CJ verification is NP-complete), and presents a comprehensive comparison of approximation measures across different constraint types. Overall, the results offer a framework for imputing data and assessing near-satisfaction of strong constraints in incomplete data, with implications for data imputation strategies and constraint-based data cleaning.

Abstract

In the case of incomplete database tables, a possible world is obtained by replacing each missing value with a value from the corresponding attribute's domain, which can be infinite. A possible key or possible functional dependency constraint is satisfied by an incomplete table if we can obtain a possible world that satisfies the given key or functional dependency. On the other hand, a certain key or certain functional dependency holds if all possible worlds satisfy the constraint. A strongly possible constraint is an intermediate concept between possible and certain constraints, based on the strongly possible world approach (a strongly possible world is obtained by replacing each NULL with a value that already appears in the corresponding attribute of the table). A strongly possible key or functional dependency holds in an incomplete table if there exists a strongly possible world that satisfies the given constraint. In the present paper, we introduce strongly possible versions of multivalued dependencies and cross joins, and we analyse the complexity of checking the validity of a given strongly possible cross join. We also study approximation measures of strongly possible keys (spKeys), functional dependencies (spFDs), multivalued dependencies (spMVDs) and cross joins (spCJs), and we address the complexity of determining the approximation values.
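To make the definition of a strongly possible world concrete, the following is a minimal brute-force sketch, not the authors' algorithm. It assumes a table is a list of equal-length tuples and that a missing value is represented by `None`; the names `NULL`, `strongly_possible_worlds` and `satisfies_sp_key` are hypothetical and chosen only for illustration. The enumeration is exponential in the number of missing values, so it is suitable only for tiny examples.

```python
from itertools import product

NULL = None  # assumed marker for a missing value (not from the paper)


def strongly_possible_worlds(table):
    """Yield every total table obtained by replacing each NULL with a value
    that already occurs in the same column (the column's active domain)."""
    if not table:
        yield []
        return
    n_cols = len(table[0])
    # Active domain of each column: the non-NULL values that appear in it.
    active = [
        list(dict.fromkeys(row[c] for row in table if row[c] is not NULL))
        for c in range(n_cols)
    ]
    # Positions (row, column) of the missing values.
    slots = [(r, c) for r, row in enumerate(table)
             for c, v in enumerate(row) if v is NULL]
    for choice in product(*(active[c] for _, c in slots)):
        world = [list(row) for row in table]
        for (r, c), value in zip(slots, choice):
            world[r][c] = value
        yield [tuple(row) for row in world]


def satisfies_sp_key(table, key_cols):
    """True if some strongly possible world has pairwise distinct
    projections onto the columns in key_cols."""
    for world in strongly_possible_worlds(table):
        projections = [tuple(row[c] for c in key_cols) for row in world]
        if len(set(projections)) == len(projections):
            return True
    return False


# Column 1 has active domain {3, 4}, so the NULL can become 3 or 4.
example = [(1, NULL), (2, 3), (1, 4)]
holds_on_both = satisfies_sp_key(example, [0, 1])   # NULL -> 3 separates all rows
holds_on_first = satisfies_sp_key(example, [0])     # rows 0 and 2 agree on column 0
```

In this example the pair of columns is a strongly possible key, while the first column alone is not, because two tuples already share the value 1 there regardless of how the NULL is filled.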

Paper Structure (23 sections, 12 theorems, 23 equations, 2 figures, 20 tables)

Key Result

proposition 1

Let $T$ be an instance over schema $R$ and let $K\subseteq R$. If the $K$-total part of the table $T$ satisfies the key $sp \left\langle K \right\rangle$, then there exists a minimum set of tuples $U$ to be removed that are all non-$K$-total so that $T\setminus U$ satisfies $sp \left\langle K \right\rangle$.
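The statement of Proposition 1 can be explored on small tables with an exhaustive search. The sketch below is illustrative only and is not taken from the paper: it assumes `None` marks a missing value, and it assumes that the active domain is recomputed from the tuples that remain after a removal. The function names `sp_key_holds` and `minimum_removal_sets` are hypothetical.

```python
from itertools import combinations, product

NULL = None  # assumed marker for a missing value (not from the paper)


def sp_key_holds(table, key_cols):
    """Brute-force check that some active-domain filling of the NULLs makes
    the projections onto key_cols pairwise distinct."""
    rows = [tuple(row[c] for c in key_cols) for row in table]
    width = len(key_cols)
    # Active domain per key column, computed from the rows passed in.
    active = [
        list(dict.fromkeys(r[c] for r in rows if r[c] is not NULL))
        for c in range(width)
    ]
    slots = [(i, c) for i, r in enumerate(rows)
             for c in range(width) if r[c] is NULL]
    for choice in product(*(active[c] for _, c in slots)):
        filled = [list(r) for r in rows]
        for (i, c), value in zip(slots, choice):
            filled[i][c] = value
        if len({tuple(r) for r in filled}) == len(filled):
            return True
    return False


def minimum_removal_sets(table, key_cols):
    """Return all smallest sets of row indices whose removal leaves a table
    satisfying the strongly possible key (exhaustive, exponential search)."""
    n = len(table)
    for k in range(n + 1):
        found = [
            set(idx) for idx in combinations(range(n), k)
            if sp_key_holds(
                [t for i, t in enumerate(table) if i not in idx], key_cols)
        ]
        if found:
            return found
    return []


# The K-total part {(1,), (2,)} already satisfies the key on column 0.
table = [(1,), (2,), (NULL,), (NULL,)]
removal_sets = minimum_removal_sets(table, [0])
```

Here the only minimum removal set consists of the two tuples with a NULL in the key column, which matches the shape of the proposition: when the $K$-total part satisfies the key, a minimum removal set made up solely of non-$K$-total tuples exists.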

Figures (2)

  • Figure 1: $spMVD$ and $NMVD$ Tables
  • Figure 2: Gadget for $3DM\prec \mathbf{General spCJ}$

Theorems & Definitions (34)

  • definition 1
  • definition 2
  • definition 3
  • definition 4
  • definition 5
  • definition 6
  • proposition 1
  • proof
  • definition 7
  • proposition 2
  • ...and 24 more