SQL4NN: Validation and expressive querying of models as data
Mark Gerarts, Juno Steegmans, Jan Van den Bussche
TL;DR
SQL4NN treats trained neural networks as intensional data that can be stored and queried inside a relational database, enabling validation, verification, and white-box analyses over both training data and learned models. The authors show that a neural network can be encoded as Node and Edge relations and evaluated via SQL views, using recursion to handle architectures of variable depth and exploiting the piecewise-linear nature of ReLU activations. They connect practical in-database evaluation and verification to theoretical results showing that queries in first-order logic over the reals with linear constraints can be expressed in SQL for networks of fixed depth. They also showcase white-box tasks such as geometry reconstruction and pruning, including computing the breakpoint of a ReLU neuron $u$ with input weight $w$ as $-u.\mathrm{bias} / w$. The work provides a proof-of-concept demo using DuckDB and PyTorch on MNIST-scale models, highlighting the potential for integrated, explainable model analytics within database systems.
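To make the Node/Edge encoding concrete, here is a minimal sketch of how a tiny ReLU network might be stored and queried relationally. It is not the authors' schema or code: the table and column names (`node`, `edge`, `input_value`, `bias`, `weight`), the four-node network, and the helper functions are all assumptions chosen for illustration, and Python's built-in `sqlite3` stands in for DuckDB so the example has no external dependencies. The network has one input, two hidden ReLU neurons, and one linear output, so its depth is fixed and no recursion is needed.

```python
import sqlite3

# Hypothetical fixed-depth evaluation query: one CTE per layer.
EVAL_SQL = """
WITH hidden AS (
    -- Hidden layer: ReLU(bias + weighted sum of input values).
    SELECT n.id, MAX(0.0, n.bias + SUM(e.weight * i.value)) AS value
    FROM node n
    JOIN edge e ON e.dst = n.id
    JOIN input_value i ON i.id = e.src
    GROUP BY n.id, n.bias
)
-- Output layer: linear (no activation) in this sketch.
SELECT n.id, n.bias + SUM(e.weight * h.value) AS value
FROM node n
JOIN edge e ON e.dst = n.id
JOIN hidden h ON h.id = e.src
GROUP BY n.id, n.bias
"""

# Breakpoint -bias / w for each hidden neuron fed directly by an input node
# (an input node is taken to be a node without incoming edges).
BREAKPOINT_SQL = """
SELECT e.dst, -n.bias / e.weight AS breakpoint
FROM edge e
JOIN node n ON n.id = e.dst
WHERE e.src NOT IN (SELECT dst FROM edge) AND e.weight <> 0
ORDER BY breakpoint
"""


def build_demo_db():
    """Create an in-memory database holding an illustrative 1-2-1 network."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE node(id INTEGER PRIMARY KEY, bias REAL);
        CREATE TABLE edge(src INTEGER, dst INTEGER, weight REAL);
        CREATE TABLE input_value(id INTEGER, value REAL);
    """)
    # Node 1 is the input, nodes 2 and 3 are hidden, node 4 is the output.
    con.executemany("INSERT INTO node VALUES (?, ?)",
                    [(1, 0.0), (2, -1.0), (3, 2.0), (4, 0.5)])
    con.executemany("INSERT INTO edge VALUES (?, ?, ?)",
                    [(1, 2, 2.0), (1, 3, -1.0), (2, 4, 1.0), (3, 4, -3.0)])
    return con


def evaluate(con, x):
    """Evaluate the stored network on the scalar input x, entirely in SQL."""
    con.execute("DELETE FROM input_value")
    con.execute("INSERT INTO input_value VALUES (1, ?)", (x,))
    return con.execute(EVAL_SQL).fetchone()[1]


def breakpoints(con):
    """Return (hidden node id, breakpoint) pairs sorted by breakpoint."""
    return con.execute(BREAKPOINT_SQL).fetchall()


if __name__ == "__main__":
    con = build_demo_db()
    print(evaluate(con, 1.5))  # 1.0
    print(breakpoints(con))    # [(2, 0.5), (3, 2.0)]
```

Each layer becomes one join-and-aggregate step, which is why a fixed depth maps onto a fixed number of nested views or CTEs; handling variable depth would instead call for a recursive query, which this sketch deliberately leaves out. The breakpoint query returns the input values at which each hidden ReLU switches between its flat and linear regimes, which is the quantity $-u.bias / w$ refers to.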
Abstract
We consider machine learning models, learned from data, to be an important, intensional kind of data in themselves. As such, various analysis tasks on models can be thought of as queries over this intensional data, often combined with extensional data such as data for training or validation. We demonstrate that relational database systems and SQL can actually be well suited for many such tasks.
