Database Views as Explanations for Relational Deep Learning

Agapi Rissaki; Ilias Fountalis; Wolfgang Gatterbauer; Benny Kimelfeld

TL;DR

This work addresses the opacity of relational deep learning models by introducing a framework where explanations are SQL-style view definitions over the database, grounded in a soft-determinacy notion that tolerates realistic perturbations. It offers a model-agnostic approach and a GNN-specific instantiation using learnable masks to identify concise, influential database components (columns, joins, and selections) that explain predictions. Empirical evaluation on RelBench demonstrates high-quality explanations with favorable runtime, and case studies show practical diagnostic capabilities such as detecting data leakage and identifying structural signals. The framework thus enables interpretable, database-centric insights for powerful relational predictive models with broad applicability to real-world datasets and tasks.

Abstract

In recent years, there has been significant progress in the development of deep learning models over relational databases, including architectures based on heterogeneous graph neural networks (hetero-GNNs) and heterogeneous graph transformers. In effect, such architectures state how the database records and links (e.g., foreign-key references) translate into a large, complex numerical expression involving numerous learnable parameters. This complexity makes it hard to explain, in human-understandable terms, how a model uses the available data to arrive at a given prediction. We present a novel framework for explaining machine-learning models over relational databases, where explanations are view definitions that highlight the focused parts of the database that contribute most to the model's prediction. We establish such global abductive explanations by adapting the classic notion of determinacy by Nash, Segoufin, and Vianu (2010). In addition to tuning the tradeoff between determinacy and conciseness, the framework allows controlling the level of granularity by adopting different fragments of view definitions, such as ones highlighting whole columns, foreign keys between tables, relevant groups of tuples, and so on. We investigate the realization of the framework in the case of hetero-GNNs and develop a model-specific approach via the notion of learnable masks. For comparison, we propose model-agnostic heuristic baselines and show that, in most cases, our approach is more efficient and achieves better explanation quality. Our extensive empirical evaluation on the RelBench collection, across diverse domains and record-level tasks, demonstrates both the usefulness of our explanations and the efficiency of their generation.
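To make the idea of a view definition as an explanation concrete, the following is a minimal sketch in Python using the standard sqlite3 module. The toy schema (customers, orders), the view names, and the choice of which columns and tuples to keep are all hypothetical and are not taken from the paper. The sketch only illustrates the three granularities the abstract mentions (whole columns, foreign keys, and groups of tuples) and does not implement determinacy or the learnable masks.

```python
import sqlite3

# Toy database; schema and data are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, age INTEGER, email TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    amount REAL,
    year INTEGER
);
INSERT INTO customers VALUES (1, 34, 'a@example.com'), (2, 51, 'b@example.com');
INSERT INTO orders VALUES (10, 1, 20.0, 2023), (11, 1, 35.5, 2024), (12, 2, 12.0, 2024);
""")

# Hypothetical explanation, column level: keep 'age', drop 'email'.
conn.execute("CREATE VIEW customers_expl AS SELECT id, age FROM customers")

# Hypothetical explanation, join and tuple level: keep the foreign key
# customer_id and the amount, restricted to orders from 2024.
conn.execute(
    "CREATE VIEW orders_expl AS "
    "SELECT id, customer_id, amount FROM orders WHERE year = 2024"
)

# Inspect what the views expose.
cur = conn.execute("SELECT * FROM customers_expl")
customer_cols = [d[0] for d in cur.description]
order_rows = conn.execute("SELECT * FROM orders_expl ORDER BY id").fetchall()
print(customer_cols)
print(order_rows)
```

Under this reading, an explanation is good when the model's prediction computed from the views alone stays close to its prediction on the full database, and it is concise when the views retain few columns, joins, and tuples.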
