
Computing Range Consistent Answers to Aggregation Queries via Rewriting

Aziz Amezian El Khalfioui, Jef Wijsen

TL;DR

This work studies conjunctive queries with aggregation on database instances that may violate primary key constraints. It investigates the queries for which the lowest aggregated value among all repairs can be determined through a rewriting in first-order logic with aggregate operators.

Abstract

We consider the problem of answering conjunctive queries with aggregation on database instances that may violate primary key constraints. In SQL, these queries follow the SELECT-FROM-WHERE-GROUP BY format, where the WHERE-clause involves a conjunction of equalities, and the SELECT-clause can incorporate aggregate operators like MAX, MIN, SUM, AVG, or COUNT. Repairs of a database instance are defined as inclusion-maximal subsets that satisfy all primary keys. For a given query, our primary objective is to identify repairs that yield the lowest aggregated value among all possible repairs. We particularly investigate queries for which this lowest aggregated value can be determined through a rewriting in first-order logic with aggregate operators.
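To make the notions of repair and lowest aggregated value concrete, here is a minimal brute-force sketch. It is not the rewriting approach studied in the paper: it simply enumerates every repair. The relation `Stock(item, town, qty)`, its primary key `(item, town)`, the sample rows, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import groupby, product

# Hypothetical relation Stock(item, town, qty) with primary key (item, town).
# Rows that share a key value violate the primary key constraint.
STOCK = [
    ("bolt", "Mons", 30),
    ("bolt", "Mons", 50),
    ("nut", "Mons", 20),
    ("nut", "Namur", 10),
    ("nut", "Namur", 40),
]


def key(row):
    """Return the primary-key part (item, town) of a row."""
    return row[:2]


def blocks(rows):
    """Group rows into blocks: maximal sets of rows that agree on the key."""
    ordered = sorted(rows, key=key)
    return [list(group) for _, group in groupby(ordered, key=key)]


def repairs(rows):
    """Yield every repair, i.e. every choice of exactly one row per block."""
    for choice in product(*blocks(rows)):
        yield list(choice)


def sum_qty(rows):
    """Evaluate the aggregation query SUM(qty) on a consistent set of rows."""
    return sum(qty for _, _, qty in rows)


def glb_sum(rows):
    """Lowest SUM(qty) over all repairs, computed by exhaustive enumeration."""
    return min(sum_qty(repair) for repair in repairs(rows))


if __name__ == "__main__":
    print(len(list(repairs(STOCK))))  # number of repairs of the sample data
    print(glb_sum(STOCK))             # lowest SUM(qty) among those repairs
```

On this sample instance there are three blocks, two of which contain two rows, so the sketch enumerates four repairs. Because the number of repairs can grow exponentially with the number of conflicting blocks, exhaustive enumeration does not scale, which is one reason to look for a rewriting that a database engine can evaluate directly.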

Paper Structure

This paper contains 44 sections, 30 theorems, 48 equations, and 5 figures.

Key Result

theorem 1

The following decision problem is decidable in quadratic time (in the size of the input): Given as input a numerical query $g()$ in $\mathsf{AGGR}[\mathsf{sjfBCQ}]$ whose aggregate operator is both monotone and associative, is ${\mathsf{GLB\text{-}CQA}}(g())$ expressible in $\mathsf{AGGR}[\mathsf{FOL}]$?

Figures (5)

  • Figure 1: Database instance $\mathbf{db}_{\mathsf{Stock}}$. Blocks are separated by dashed lines.
  • Figure 2: Attack graphs for two queries in $\mathsf{sjfBCQ}$: The query on the right is derived from the query on the left by initializing $x$ to $b$ and $y$ to $c$.
  • Figure 3: Example database instance $\mathbf{db}_0$, and the set $M_{0}$ of all $\forall$embeddings of $q_{0}$ into $\mathbf{db}_{0}$.
  • Figure 4: Computation of a $\mathcal{F}_{\mathtt{SUM}}$-minimal MCS relative to $\{{x}\rightarrow{y}, {yz}\rightarrow{r}\}$.
  • Figure 5: Calculation of both an $\mathcal{F}_{\mathtt{SUM}}$-minimal MCS (formula $\phi_{2}$) and ${\mathsf{GLB\text{-}CQA}}(g_{0}())$ for $\mathtt{SUM}(r)\leftarrow R(\underline{x},y), S(\underline{y,z},d,r)$.

Theorems & Definitions (73)

  • theorem 1: Separation Theorem
  • theorem 2: Koutris and Wijsen, TODS 2017 (DBLP:journals/tods/KoutrisW17)
  • lemma 1
  • lemma 2
  • lemma 3
  • definition 1
  • theorem 3
  • theorem 4
  • definition 2
  • lemma 4
  • ...and 63 more