A Chase-based Approach to Consistent Answers of Analytic Queries in Star Schemas

Dominique Laurent, Nicolas Spyratos

TL;DR

The paper addresses the problem of computing consistent answers to analytic queries over star-schema data warehouses that may contain inconsistencies and missing values. It extends the authors' earlier Chase-based approach, using an mChase framework to compute repairs and to identify true and consistent tuples in star-tables. Provided that the selection condition involves no keys and is independent (expressible as a conjunction of single-attribute conditions), it proves that the exact consistent answers to projection-selection-join queries and to analytic queries can be computed in time polynomial in the warehouse size $|T|$. The result supports analytic querying over imperfect data warehouses and connects to established notions of repairs and functional dependencies.

Abstract

We present an approach to computing consistent answers to analytic queries in data warehouses operating under a star schema and possibly containing missing values and inconsistent data. Our approach is based on earlier work concerning consistent query answering for standard, non-analytic queries in multi-table databases. In that work we presented polynomial algorithms for computing either the exact consistent answer to a standard, non-analytic query or bounds of the exact answer, depending on whether the query involves a selection condition or not. We extend this approach to computing exact consistent answers of analytic queries over star schemas, provided that the selection condition in the query involves no keys and satisfies the property of independency (i.e., the condition can be expressed as a conjunction of conditions each involving a single attribute). The main contributions of this paper are: (a) a polynomial algorithm for computing the exact consistent answer to a usual projection-selection-join query over a star schema under the above restrictions on the selection condition, and (b) a proof that, under the same restrictions, the exact consistent answer to an analytic query over a star schema can be computed in time polynomial in the size of the data warehouse.
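To make the two restrictions on the selection condition concrete, here is a minimal Python sketch. It is only an illustration of the definitions stated above, not the paper's algorithm: it does not compute repairs or consistent answers. The representation (a dictionary from attribute name to predicate), the attribute names `K1`, `K2`, `region`, `amount`, and the choice to treat a missing value as not satisfying a predicate are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, Set

# Hypothetical representation: an independent selection condition is a
# conjunction of per-attribute predicates, stored as attribute -> predicate.
Condition = Dict[str, Callable[[Any], bool]]


def involves_no_keys(condition: Condition, key_attributes: Set[str]) -> bool:
    """Return True when no predicate is placed on a key attribute."""
    return key_attributes.isdisjoint(condition.keys())


def satisfies(row: Dict[str, Any], condition: Condition) -> bool:
    """Evaluate the conjunction attribute by attribute.

    Assumption of this sketch: a missing value (None or absent) does not
    satisfy any predicate.
    """
    return all(
        row.get(attr) is not None and pred(row[attr])
        for attr, pred in condition.items()
    )


# Illustrative schema: K1 and K2 stand for keys, the rest are non-key attributes.
KEYS = {"K1", "K2"}

# region = 'EU' AND amount > 100: each conjunct mentions exactly one attribute.
condition: Condition = {
    "region": lambda v: v == "EU",
    "amount": lambda v: v > 100,
}

row = {"K1": 1, "K2": 7, "region": "EU", "amount": 250}
```

By contrast, a condition such as `amount > discount` compares two attributes in a single conjunct, so it cannot be written in this per-attribute form and falls outside the independent case.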

Paper Structure

This paper contains 1 section.

Table of Contents

  1. Introduction