REAEDP: Entropy-Calibrated Differentially Private Data Release with Formal Guarantees and Attack-Based Evaluation

Bo Ma; Jinsong Wu; Wei Qi Yan

Abstract

Sensitive data release is vulnerable to output-side privacy threats such as membership inference, attribute inference, and record linkage. This creates a practical need for release mechanisms that provide formal privacy guarantees while preserving utility in measurable ways. We propose REAEDP, a differential privacy framework that combines entropy-calibrated histogram release, a synthetic-data release mechanism, and attack-based evaluation. On the theory side, we derive an explicit sensitivity bound for Shannon entropy, together with an extension to Rényi entropy, for adjacent histogram datasets, enabling calibrated differentially private release of histogram statistics. We further study a synthetic-data mechanism $\mathcal{F}$ with a privacy-test structure and show that it satisfies a formal differential privacy guarantee under the stated parameter conditions. On multiple public tabular datasets, the empirical entropy change remains below the theoretical bound in the tested regime, standard Laplace and Gaussian baselines exhibit comparable trends, and both membership-inference and linkage-style attack performance move toward random-guess behavior as the privacy parameter decreases. These results support REAEDP as a practically usable privacy-preserving release pipeline in the tested settings. Source code: https://github.com/mabo1215/REAEDP.git

Paper Structure (60 sections, 9 theorems, 42 equations, 12 figures, 7 tables, 1 algorithm)

Introduction
Practical privacy problem.
Technical gap.
Contribution.
Threat model
Motivation
Related Work
Differential privacy for statistical release and synthetic data
Entropy-based and information-theoretic perspectives
Kernel and functional-space privacy mechanisms
Privacy attacks and empirical evaluation
Relation to private optimization
Position of this work
Preliminaries
Adjacent Datasets
...and 45 more sections

Key Result

Theorem 1

If mechanism $\mathcal{M}_i$ is $(\varepsilon_i,\delta_i)$-DP for $i=1,\ldots,k$, then the composition $(\mathcal{M}_1(D),\ldots,\mathcal{M}_k(D))$ is $\bigl(\sum_{i=1}^k \varepsilon_i,\, \sum_{i=1}^k \delta_i\bigr)$-DP. See the proofs appendix for references.
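The budget accounting in Theorem 1 can be illustrated with a minimal sketch. The function name `compose_sequential` and the example budgets are hypothetical and are not taken from the REAEDP source code; the sketch only adds up the per-mechanism parameters as the theorem states.

```python
from typing import Iterable, Tuple


def compose_sequential(budgets: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the (epsilon, delta) of a sequential composition.

    Each entry of `budgets` is the (epsilon_i, delta_i) of one mechanism.
    Per Theorem 1, the composed release is (sum epsilon_i, sum delta_i)-DP.
    """
    eps_total = 0.0
    delta_total = 0.0
    for eps, delta in budgets:
        # Reject parameters that are not valid DP budgets.
        if eps < 0 or not (0.0 <= delta <= 1.0):
            raise ValueError(f"invalid budget: ({eps}, {delta})")
        eps_total += eps
        delta_total += delta
    return eps_total, delta_total


# Hypothetical example: three releases on the same dataset.
budgets = [(0.5, 1e-6), (0.25, 1e-6), (0.25, 0.0)]
total_eps, total_delta = compose_sequential(budgets)
print(total_eps, total_delta)
```

Under these assumed budgets the three releases together consume a total $\varepsilon$ of 1.0, so a pipeline with a fixed overall budget has to split it across its steps before any of them runs.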

Figures (12)

Figure 1: Privacy test pass rate vs. $\gamma$ for several $k$ ($t=2$).
Figure 2: Wiener kernel: original vs. private mean for $\rho = 10^{-6}$, $0.001$, $0.1$ (left to right).
Figure 3: Entropy sensitivity bound $\Delta_H$ vs. dataset size $n$, illustrating the decrease of the theoretical bound used for calibrated histogram release.
Figure 4: Empirical $\widehat{\Delta H}$ vs. theoretical bound (the entropy sensitivity theorem): mean and maximum over adjacent pairs. The ratio remains below 1 in the tested regime, indicating that the empirical entropy sensitivity stays below the theoretical bound.
Figure 5: Baseline comparison: entropy error and count MAE vs. $\varepsilon$ (Laplace, Gaussian, DP synthetic (Laplace), DP synthetic (Gaussian)); $\Delta_H$ bound shown.
...and 7 more figures
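
The empirical quantity $\widehat{\Delta H}$ in Figure 4 compares the entropy of a histogram with that of an adjacent one. A minimal sketch of such a measurement follows, assuming replacement adjacency (one record moves from one bin to another), natural-log entropy, and moves restricted to bins already present in the data. The function names are hypothetical, the paper's theoretical bound is not reproduced here, and the authors' actual evaluation procedure may differ.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy (in nats) of a histogram given as bin counts."""
    counts = list(counts)
    n = sum(counts)
    # Empty bins contribute nothing to the entropy.
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def max_entropy_change(records):
    """Largest |H(D) - H(D')| over neighbors D' of D.

    A neighbor is obtained by moving one record from an occupied bin
    to a different bin that already exists in the data.
    """
    base = Counter(records)
    bins = sorted(base)
    h0 = shannon_entropy(base.values())
    worst = 0.0
    for src in bins:
        for dst in bins:
            if src == dst:
                continue
            neighbor = base.copy()
            neighbor[src] -= 1  # remove one record from the source bin
            neighbor[dst] += 1  # and place it in the destination bin
            change = abs(shannon_entropy(neighbor.values()) - h0)
            worst = max(worst, change)
    return worst


# Hypothetical toy dataset with three categories.
records = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
empirical_change = max_entropy_change(records)
print(empirical_change)
```

Dividing a value like `empirical_change` by the theoretical bound for the same $n$ would give a ratio of the kind plotted in Figure 4.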

Theorems & Definitions (16)

Definition 1: $(\varepsilon,\delta)$-Differential Privacy [dwork2006differential]
Theorem 1: Sequential Composition
Theorem 2: Advanced Composition [dwork2014algorithmic]
Theorem 3: Shannon entropy sensitivity under replacement adjacency
Lemma 1
Lemma 2: Neighboring datasets
Lemma 3
Corollary 1
Lemma 4
Theorem 4: Differential privacy of $\mathcal{F}$ under add/remove adjacency
...and 6 more
