Symbolic Quantitative Information Flow for Probabilistic Programs

Philipp Schröer, Francesca Randone, Raúl Pardo, Andrzej Wąsowski

TL;DR

The paper develops symbolic methods to quantify information leakage in probabilistic programs, building on two semantic frameworks: weakest pre-expectation calculus for discrete programs and Second Order Gaussian Approximation (SOGA) for programs that combine discrete and continuous distributions. For discrete programs it yields exact symbolic expressions; in the SOGA setting it approximates the exact semantics with Gaussian mixtures and computes bounds. The measures covered are entropy, conditional entropy, KL divergence, and mutual information. The approach includes sufficient conditions under which SOGA aligns with the exact semantics, and it is demonstrated on two differential privacy mechanisms: randomized response and the Gaussian mechanism. The case studies show how attacker priors and privacy parameters shape information leakage, and the method is positioned as a semantics-driven alternative to sampling-based or model-counting approaches.

Abstract

It is of utmost importance to ensure that modern data-intensive systems do not leak sensitive information. In this paper, the authors, who met thanks to Joost-Pieter Katoen, discuss symbolic methods to compute information-theoretic measures of leakage: entropy, conditional entropy, Kullback-Leibler divergence, and mutual information. We build on two semantic frameworks for symbolic execution of probabilistic programs. For discrete programs, we use weakest pre-expectation calculus to compute exact symbolic expressions for the leakage measures. Using Second Order Gaussian Approximation (SOGA), we handle programs that combine discrete and continuous distributions. However, in the SOGA setting, we approximate the exact semantics using Gaussian mixtures and compute bounds for the measures. We demonstrate the use of our methods in two widely used mechanisms to ensure differential privacy: randomized response and the Gaussian mechanism.

Paper Structure

This paper contains 23 sections, 8 theorems, 43 equations, 3 figures, 3 tables, 2 algorithms.

Key Result

Proposition

For a program $S$ and variable $x_i$, the entropy $H(x_i)$ of the value of $x_i$ on termination of $S$ can be computed as follows:
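The idea behind this computation can be illustrated numerically. The sketch below is not the paper's symbolic formula; it assumes the standard Shannon entropy and uses the fact that the weakest pre-expectation of an indicator $[x_i = v]$ gives the probability that $x_i = v$ on termination. The program is a hypothetical binary randomized response in which the output equals the secret with probability `keep_prob` and is flipped otherwise; the function names and this exact mechanism are assumptions made for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a dict mapping value -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def output_distribution(prior_true, keep_prob):
    """Output distribution of an assumed binary randomized response.

    prior_true: attacker's prior probability that the secret is True.
    keep_prob:  probability that the output equals the secret
                (otherwise the output is the negated secret).
    Each entry plays the role of the probability of [o = v] on termination.
    """
    p_out_true = prior_true * keep_prob + (1 - prior_true) * (1 - keep_prob)
    return {True: p_out_true, False: 1 - p_out_true}


if __name__ == "__main__":
    for prior in (0.1, 0.5, 0.9):
        h = entropy(output_distribution(prior, keep_prob=0.75))
        print(f"prior={prior:.1f}  H(o)={h:.4f} bits")
```

Under these assumptions a uniform prior gives exactly one bit of output entropy for any `keep_prob`, while skewed priors give less.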

Figures (3)

  • Figure 1: Calculation of $\mathsf{wp}\llbracket{S}\rrbracket([{\underline{{o_i}} = o_i}])$ to compute the entropy of outputs. The calculation proceeds bottom-up, with the final result being in the first line.
  • Figure 2: Entropy of outputs $H(o_i)$.
  • Figure 3: Mutual information $\textrm{I}(r_i; o_i)$ between secret response and output.
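The quantity plotted in Figure 3 can be explored numerically. As a minimal sketch, assuming a hypothetical binary randomized response in which the output $o$ equals the secret response $r$ with probability `keep_prob` and is flipped otherwise, the code below builds the joint distribution of $(r, o)$ and evaluates the standard definition of mutual information. The mechanism and all names are assumptions for illustration, not the paper's program or its symbolic method.

```python
import math


def randomized_response_joint(prior_true, keep_prob):
    """Joint distribution {(r, o): prob} for an assumed binary mechanism:
    the output o equals the secret r with probability keep_prob,
    and is the negated secret otherwise."""
    joint = {}
    for r, p_r in ((True, prior_true), (False, 1 - prior_true)):
        joint[(r, r)] = p_r * keep_prob
        joint[(r, not r)] = p_r * (1 - keep_prob)
    return joint


def mutual_information(joint):
    """I(X; Y) in bits, from a dict mapping (x, y) -> probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # Zero-probability pairs contribute nothing and are skipped.
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


if __name__ == "__main__":
    for keep in (0.5, 0.75, 1.0):
        mi = mutual_information(randomized_response_joint(0.5, keep))
        print(f"keep_prob={keep:.2f}  I(r; o)={mi:.4f} bits")
```

With `keep_prob = 0.5` the output is independent of the secret and the leakage is zero; with `keep_prob = 1.0` and a uniform prior the full bit leaks.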

Theorems & Definitions (12)

  • Proposition
  • Proof
  • Proposition
  • Proposition
  • Proposition
  • Proof
  • Proposition
  • Proof
  • Proposition: Bounds for entropy of GMs [huber2008entropy]
  • Proposition: Bounds for KL divergence [hershey2007approximating]
  • ...and 2 more