Efficient Query Repair for Aggregate Constraints
Shatha Algarni, Boris Glavic, Seokki Lee, Adriane Chapman
TL;DR
The paper addresses repairing a user query so that its result satisfies constraints expressed as arithmetic combinations of aggregates, including non-monotone constraints such as fairness measures. It introduces two pruning-based repair frameworks, algff and algrp, that use kd-tree clustering to reuse aggregation results and interval arithmetic to bound constraint results, enabling efficient computation of top-k repairs. The authors formalize the problem, prove correctness aspects, and demonstrate substantial runtime gains over brute-force baselines and prior work across diverse datasets and constraints. The work makes it possible to enforce sophisticated constraints on query results, such as statistical parity difference (SPD) fairness, while staying close to the user's original query, with practical impact for fair and compliant data retrieval.
Abstract
In many real-world scenarios, query results must satisfy domain-specific constraints. For instance, a minimum percentage of the interview candidates selected based on their qualifications should be female. Such requirements can be expressed as constraints over an arithmetic combination of aggregates evaluated on the result of the query. In this work, we study how to repair a query to fulfill such constraints by modifying the filter predicates of the query. We introduce a novel query repair technique that leverages bounds on sets of candidate solutions and interval arithmetic to efficiently prune the search space. We demonstrate experimentally that our technique significantly outperforms baselines that consider a single candidate at a time.
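
The pruning idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a single filter predicate of the form score >= t, a toy in-memory table, and a constraint that the fraction of female rows in the result be at least min_ratio. All names (Interval, ratio_bounds, repair_candidates, ROWS) are hypothetical. For a whole range of candidate thresholds, the counts are bounded by their values at the two ends of the range, interval division bounds the ratio, and the range is then rejected, accepted, or split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __truediv__(self, other):
        # Interval division; this sketch only supports positive denominators.
        if other.lo <= 0:
            raise ValueError("denominator interval must be strictly positive")
        q = [self.lo / other.lo, self.lo / other.hi,
             self.hi / other.lo, self.hi / other.hi]
        return Interval(min(q), max(q))


# Toy data (an assumption for illustration): (score, is_female).
ROWS = [(90, True), (85, False), (80, False), (75, True),
        (70, True), (65, False), (60, False), (55, False)]


def count(rows, t, females_only=False):
    # Number of rows selected by the predicate score >= t.
    return sum(1 for s, f in rows if s >= t and (f or not females_only))


def ratio_bounds(rows, t_lo, t_hi):
    # Counts are non-increasing in t, so for every t in [t_lo, t_hi] they
    # lie between the count at t_hi (smallest) and at t_lo (largest).
    females = Interval(count(rows, t_hi, True), count(rows, t_lo, True))
    total = Interval(count(rows, t_hi), count(rows, t_lo))
    return None if total.lo == 0 else females / total


def repair_candidates(rows, thresholds, min_ratio):
    # thresholds must be sorted ascending; returns all satisfying thresholds.
    if not thresholds:
        return []
    bounds = ratio_bounds(rows, thresholds[0], thresholds[-1])
    if bounds is not None:
        if bounds.hi < min_ratio:      # no candidate in the range can satisfy
            return []
        if bounds.lo >= min_ratio:     # every candidate in the range satisfies
            return list(thresholds)
    if len(thresholds) == 1:           # empty result: treat as not satisfying
        return []
    mid = len(thresholds) // 2         # undecided: split the range and recurse
    return (repair_candidates(rows, thresholds[:mid], min_ratio)
            + repair_candidates(rows, thresholds[mid:], min_ratio))


if __name__ == "__main__":
    print(repair_candidates(ROWS, [55, 60, 65, 70, 75, 80, 85, 90], 0.5))
```

On this toy data the satisfying thresholds are not contiguous (80 fails while 75 and 85 pass), which shows why such a ratio constraint is non-monotone and why bounding whole ranges helps: the range [55, 60] is rejected and the ranges [65, 70] and [85, 90] are accepted without checking their members individually. The sketch recomputes counts by scanning the table; reusing aggregation results across candidates, as the paper does, is outside its scope.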
