Causality-Based Scores Alignment in Explainable Data Management
Felipe Azua, Leopoldo Bertossi
TL;DR
This paper addresses whether different attribution scores used to explain query answers in relational databases induce compatible tuple rankings. It centers on causality-based scores, particularly the Causal Effect score (CES) and Causal Responsibility (Resp), and provides a syntactic dichotomy for Boolean conjunctive queries (BCQs) that characterizes when CES and Resp are aligned across all databases, including those containing exogenous tuples. The authors also compare CES and Resp with the Shapley value, showing that alignment with Shapley can fail in the presence of exogenous tuples, and they propose reductions to reduced queries as a way to transfer alignment results. The work lays a foundation for understanding when different explanation scores concur, and it highlights exogenous tuples as a key factor. It also identifies open questions for broader query classes and score types, with practical implications for explainable data management.
Abstract
Different attribution scores have been proposed to quantify the relevance of database tuples for query answering; e.g., Causal Responsibility, the Shapley Value, the Banzhaf Power-Index, and the Causal Effect. They have been analyzed in isolation. This work is a first investigation of score alignment depending on the query and the database, i.e., of whether the scores induce compatible rankings of tuples. We concentrate mostly on causality-based scores, and provide a syntactic dichotomy result for queries: on one side, pairs of scores are always aligned; on the other, they are not always aligned. It turns out that the presence of exogenous tuples makes a crucial difference in this regard.
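To make the notion of alignment concrete, the following minimal sketch checks whether two score assignments rank a set of tuples compatibly on one database. It assumes that "compatible" means the two scores order every pair of tuples in the same way, ties included; this reading is an assumption, not a definition quoted from the abstract. The function name `aligned` and all score values are hypothetical and purely illustrative; they are not computed from any query or database.

```python
from itertools import combinations


def aligned(score_a, score_b):
    """Return True if both score maps order every pair of tuples identically.

    Assumed reading of "compatible rankings": for all tuples t1, t2,
    score_a[t1] <= score_a[t2] holds exactly when score_b[t1] <= score_b[t2],
    in both directions (so ties must also coincide).
    """
    if score_a.keys() != score_b.keys():
        raise ValueError("both scores must be defined on the same tuples")
    for t1, t2 in combinations(score_a, 2):
        same_le = (score_a[t1] <= score_a[t2]) == (score_b[t1] <= score_b[t2])
        same_ge = (score_a[t1] >= score_a[t2]) == (score_b[t1] >= score_b[t2])
        if not (same_le and same_ge):
            return False
    return True


# Made-up score values for three tuples (illustration only).
first = {"t1": 0.5, "t2": 1.0, "t3": 0.5}
second = {"t1": 0.25, "t2": 0.75, "t3": 0.25}
third = {"t1": 0.75, "t2": 0.25, "t3": 0.25}

print(aligned(first, second))  # same ordering and same ties
print(aligned(first, third))   # t1 and t2 are ordered oppositely
```

Note that this check covers a single database only. The dichotomy described in the abstract concerns alignment for a query across all databases, which a pairwise check on one instance cannot establish.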
