Table of Contents
Fetching ...

Scalable Precise Computation of Shannon Entropy

Yong Lai, Haolong Tong, Zhenghang Xu, Minghao Yin

TL;DR

This paper focuses on the programs modeled in Boolean constraints and optimizes the two stages of the Shannon entropy computation to implement a scalable precise tool PSE, which is compared with the state-of-the-art probabilistic approximately correct tool EntropyEstimation, which was shown to significantly outperform the previous precise tools.

Abstract

Quantitative information flow analyses (QIF) are a class of techniques for measuring the amount of confidential information leaked by a program to its public outputs. Shannon entropy is an important method to quantify the amount of leakage in QIF. This paper focuses on the programs modeled in Boolean constraints and optimizes the two stages of the Shannon entropy computation to implement a scalable precise tool PSE. In the first stage, we design a knowledge compilation language called \ADDAND that combines Algebraic Decision Diagrams and conjunctive decomposition. \ADDAND avoids enumerating possible outputs of a program and supports tractable entropy computation. In the second stage, we optimize the model counting queries that are used to compute the probabilities of outputs. We compare PSE with the state-of-the-art probabilistic approximately correct tool EntropyEstimation, which was shown to significantly outperform the previous precise tools. The experimental results demonstrate that PSE solved 56 more benchmarks compared to EntropyEstimation in a total of 459. For 98\% of the benchmarks that both PSE and EntropyEstimation solved, PSE is at least $10\times$ as efficient as EntropyEstimation.

Scalable Precise Computation of Shannon Entropy

TL;DR

This paper focuses on the programs modeled in Boolean constraints and optimizes the two stages of the Shannon entropy computation to implement a scalable precise tool PSE, which is compared with the state-of-the-art probabilistic approximately correct tool EntropyEstimation, which was shown to significantly outperform the previous precise tools.

Abstract

Quantitative information flow analyses (QIF) are a class of techniques for measuring the amount of confidential information leaked by a program to its public outputs. Shannon entropy is an important method to quantify the amount of leakage in QIF. This paper focuses on the programs modeled in Boolean constraints and optimizes the two stages of the Shannon entropy computation to implement a scalable precise tool PSE. In the first stage, we design a knowledge compilation language called \ADDAND that combines Algebraic Decision Diagrams and conjunctive decomposition. \ADDAND avoids enumerating possible outputs of a program and supports tractable entropy computation. In the second stage, we optimize the model counting queries that are used to compute the probabilities of outputs. We compare PSE with the state-of-the-art probabilistic approximately correct tool EntropyEstimation, which was shown to significantly outperform the previous precise tools. The experimental results demonstrate that PSE solved 56 more benchmarks compared to EntropyEstimation in a total of 459. For 98\% of the benchmarks that both PSE and EntropyEstimation solved, PSE is at least as efficient as EntropyEstimation.

Paper Structure

This paper contains 13 sections, 2 theorems, 4 equations, 5 figures, 1 table, 1 algorithm.

Key Result

proposition 1

Given a non-terminal node $u$ in $\text{ADD}[\land]$, its weight $\mathit{\omega}(u)$ can be recursively computed as follows in polynomial time: where $n_0 = |\mathit{Vars}(u)| - |\mathit{Vars}(lo(u))| - 1$ and $n_1 = |\mathit{Vars}(u)| - |\mathit{Vars}(hi(u))| - 1$.

Figures (5)

  • Figure 1: An $\text{ADD}[\land]$ representing the probability distribution of $\varphi_n^{sep}$ in Example \ref{['hard-circuit-example']} over its outputs. According to Proposition \ref{['prop:omega-proposition']}, the computed weight for each node is marked in blue font. According to Proposition \ref{['prop:Entropy-proposition']}, the computed entropy for each node is marked in red font.
  • Figure 2: The execution example of PSE on Example \ref{['easy-circuit-example']} follows the variable order of $y_1 \prec y_2 \prec y_3 \prec y_4 \prec y_5$, where the corresponding computational trajectory is represented as an $\text{ADD}[\land]$. The entropy computation process performed by PSE is explicitly annotated in red font. The calculation process of weight (number of models) is presented in blue font.
  • Figure 3: An ADD structure, constructed in Example \ref{['easy-circuit-example']}, follows the variable order of $y_1 \prec y_2 \prec y_3 \prec y_4 \prec y_5$. According to Proposition \ref{['prop:omega-proposition']}, weight is marked in blue font, and according to Proposition \ref{['prop:Entropy-proposition']}, entropy is marked in red font.
  • Figure 4: Scatter Plot of the running time Comparison between PSE and EntropyEstimation.
  • Figure 5: Cactus plot comparing different methods.

Theorems & Definitions (7)

  • definition 1
  • definition 2
  • proposition 1
  • proof
  • proposition 2
  • proof
  • proof