Qrlew: Rewriting SQL into Differentially Private SQL
Nicolas Grislain, Paul Roussel, Victoria de Sainte Agathe
TL;DR
Qrlew tackles the challenge of making differential privacy (DP) practical for SQL analytics: data practitioners write standard SQL queries, and Qrlew rewrites them into differentially private equivalents. It introduces a Relation-based intermediate representation, range propagation with $k$-Intervals and piecewise-monotonic functions, and a privacy unit definition to track ownership across related tables, all feeding into a two-phase rewriting process that yields a DP-compatible query. The rewriting allocates and then applies rules to propagate privacy through the query, using DP aggregations with Gaussian noise and $\tau$-thresholding for grouping keys, with privacy accounting via a Rényi differential privacy (RDP) accountant. The rewritten query executes entirely within standard SQL back-ends. Compared with existing DP libraries and systems, Qrlew emphasizes an SQL interface, in-database DP execution, and end-to-end automatic rewriting, which reduces integration friction for real-world analytics. The paper also identifies current limitations and avenues for future work.
Abstract
This paper introduces Qrlew, an open-source library that parses SQL queries into Relations, an intermediate representation that keeps track of rich data types, value ranges, and row ownership, so that they can easily be rewritten into differentially private equivalents and turned back into SQL queries for execution in a variety of standard data stores. With Qrlew, a data practitioner can express their data queries in standard SQL; the data owner can run the rewritten query without any technical integration and with strong privacy guarantees on the output; and the query rewriting can be operated by a privacy expert who must be trusted by the owner but may belong to a separate organization.
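To make the idea of a differentially private aggregation concrete, the following is a minimal Python sketch of a grouped sum that clips values to a known range, adds Gaussian noise, and releases a grouping key only when its noisy count of privacy units exceeds a threshold $\tau$. It is an illustration only, not Qrlew's API or implementation: the function names, the one-row-per-privacy-unit assumption, and the textbook noise calibration are all assumptions made here, and Qrlew itself expresses such computations as rewritten SQL rather than Python.

```python
import math
import random
from collections import defaultdict


def gaussian_sigma(epsilon, delta, sensitivity):
    """Textbook Gaussian-mechanism noise scale (an assumption of this sketch;
    this calibration is only valid for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def dp_grouped_sum(rows, lower, upper, sigma_sum, sigma_count, tau, rng):
    """Hypothetical helper: noisy SUM(value) GROUP BY key.

    rows: iterable of (privacy_unit_id, group_key, value).
    Assumes each privacy unit contributes at most one row, so that the
    clipped sum has sensitivity max(|lower|, |upper|) and the count of
    distinct privacy units per group has sensitivity 1.
    """
    sums = defaultdict(float)
    units = defaultdict(set)
    for unit_id, key, value in rows:
        # Clip to the known value range so the sensitivity is bounded.
        sums[key] += min(max(value, lower), upper)
        units[key].add(unit_id)

    released = {}
    for key in sums:
        # Thresholding: publish a key only if its noisy unit count is large.
        noisy_count = len(units[key]) + rng.gauss(0.0, sigma_count)
        if noisy_count > tau:
            released[key] = sums[key] + rng.gauss(0.0, sigma_sum)
    return released


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(i, "paris" if i % 3 else "lyon", 20.0 + i) for i in range(300)]
    sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=100.0)
    print(dp_grouped_sum(data, 0.0, 100.0, sigma, 5.0, 30.0, rng))
```

The clipping bounds play the role of the value ranges tracked by the intermediate representation, and the privacy unit identifier plays the role of row ownership; without both, the sensitivity of the sum could not be bounded.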
