Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting

Lisa Oakley; Steven Holtzen; Alina Oprea

Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting

Lisa Oakley, Steven Holtzen, Alina Oprea

TL;DR

This work develops a method for tight privacy and accuracy bound synthesis using weighted model counting on binary decision diagrams, a state of the art technique from the artificial intelligence and automated reasoning communities for exactly computing probability distributions.

Abstract

Programmatically generating tight differential privacy (DP) bounds is a hard problem. Two core challenges are (1) finding expressive, compact, and efficient encodings of the distributions of DP algorithms, and (2) state space explosion stemming from the multiple quantifiers and relational properties of the DP definition. We address the first challenge by developing a method for tight privacy and accuracy bound synthesis using weighted model counting on binary decision diagrams, a state-of-the-art technique from the artificial intelligence and automated reasoning communities for exactly computing probability distributions. We address the second challenge by developing a framework for leveraging inherent symmetries in DP algorithms. Our solution benefits from ongoing research in probabilistic programming languages, allowing us to succinctly and expressively represent different DP algorithms with approachable language syntax that can be used by non-experts. We provide a detailed case study of our solution on the binary randomized response algorithm. We also evaluate an implementation of our solution using the Dice probabilistic programming language for the randomized response and truncated geometric above threshold algorithms. We compare to prior work on exact DP verification using Markov chain probabilistic model checking and the decision procedure DiPC. Very few existing works consider mechanized analysis of accuracy guarantees for DP algorithms. We additionally provide a detailed analysis using our technique for finding tight accuracy bounds for DP algorithms.

Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting

TL;DR

Abstract

Paper Structure (36 sections, 11 theorems, 11 equations, 4 figures, 3 tables, 5 algorithms)

This paper contains 36 sections, 11 theorems, 11 equations, 4 figures, 3 tables, 5 algorithms.

Introduction
Preliminaries
Differential Privacy
Accuracy
Weighted Model Counting
Model Counting
Weighted Boolean Formula (WBF)
Weighted Model Counting (WMC)
WMC for Probabilistic Algorithms
Weighted Model Counting for Synthesizing Tight Privacy and Accuracy Bounds
Synthesis Problems
WMC and Probabilistic Programming For Privacy Bound Synthesis
WMC for Exhaustive Accuracy Bound Synthesis
Framework for Leveraging Symmetries
Inference Algorithm
...and 21 more sections

Key Result

Theorem 1

Given randomized $A:{\mathcal{X}^n}\rightarrow{\mathcal{Y}}$, with discrete, finite ${\mathcal{X}^n}$ and ${\mathcal{Y}}$, set of target values $\mathcal{V}$, and $\alpha\geq 0$, the $\mathbf{x}$ which minimizes $\sum_{y\in[\mathcal{V}_\mathbf{x}-\alpha,\mathcal{V}_\mathbf{x}+\alpha]}P(A(\mathbf{x})

Figures (4)

Figure 1: We represent the probability distribution of a randomized algorithm as a binary decision diagram (BDD) so that we can efficiently compute the weighted model count to perform probabilistic inference (compute the probabilities of certain events occurring). Starting from a randomized algorithm, there are many ways to compile to a BDD. The top path shows a more manual process where a user crafts a weighted boolean formula by hand and uses a knowledge compiler, a tool for efficiently representing and querying logical formulas, to compile this into a BDD. The bottom path exhibits an example of the flow of an expressive probabilistic programming languages which allow the user to define an easily readable program defined similarly to the pseudocode. From here, the programming language can compile an optimized logical expression and pass it to the knowledge compiler. In this way, there is very little manual effort, and the resulting logical expression and BDD are optimized for WMC queries.
Figure 2: BDD sizes for the truncated geometric above threshold algorithm for various maximum integer sizes ($k$) and list lengths ($n$).
Figure 3: BDD sizes and experiment durations for accuracy bound synthesis for randomized response with counting over various numbers of clients $n$ and $\lambda=0.2$.
Figure 4: Tight Accuracy and privacy bounds over varying coin flip parameters, $\lambda$, for binary randomized response with $n=8$. Generated using the Dice restricted privacy and accuracy bound synthesis implementation.

Theorems & Definitions (26)

Definition 1: $\varepsilon$-Differential Privacy
Definition 2: ($\alpha,\beta$)-Accuracy
Definition 3: WBF of a probabilistic algorithm
Theorem 1
proof
Definition 4: Privacy Set
Theorem 2
proof
Definition 5: Accuracy Set
Theorem 3
...and 16 more

Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting

TL;DR

Abstract

Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (4)

Theorems & Definitions (26)