Elephants Do Not Forget: Differential Privacy with State Continuity for Privacy Budget

Jiankai Jin, Chitchanok Chuengsatiansup, Toby Murray, Benjamin I. P. Rubinstein, Yuval Yarom, Olga Ohrimenko

TL;DR

ElephantDP addresses the vulnerability of DP systems to budget-tampering by introducing a state-continuity module (SCM) and TEEs to maintain a persistent privacy budget and faithfully execute DP routines. The approach ensures liveness (crash recovery with the latest budget) and DP confidentiality (outputs equivalent to a trusted-curator system), even under adversarial cloud environments and potential collusion. The authors formalize transcript-equivalence to accommodate randomized DP outputs, implement the system on Intel SGX with a distributed SCM, and demonstrate 1.1×–3.2× overheads relative to insecure baselines, with better efficiency on complex queries. This work significantly improves practical deployment of global DP in untrusted settings, enabling secure query interfaces while protecting sensitive data from budget-based reconstruction attacks. The combination of DP theory, secure hardware, and a verifiable state-continuity protocol offers a robust path toward trustworthy data sharing in cloud environments.

Abstract

Current implementations of differentially-private (DP) systems either lack support to track the global privacy budget consumed on a dataset, or fail to faithfully maintain the state continuity of this budget. We show that failure to maintain a privacy budget enables an adversary to mount replay, rollback and fork attacks, obtaining answers to many more queries than a secure system would allow. As a result, the attacker can reconstruct secret data that DP aims to protect, even if the DP code runs in a Trusted Execution Environment (TEE). We propose ElephantDP, a system that aims to provide the same guarantees as a trusted curator in the global DP model would, albeit set in an untrusted environment. Our system relies on a state continuity module to provide protection for the privacy budget and a TEE to faithfully execute DP code and update the budget. To provide security, our protocol makes several design choices including the content of the persistent state and the order between budget updates and query answers. We prove that ElephantDP provides liveness (i.e., the protocol can restart from a correct state and respond to queries as long as the budget is not exceeded) and DP confidentiality (i.e., an attacker learns as much about a dataset as it would from interacting with a trusted curator). Our implementation and evaluation of the protocol use Intel SGX as a TEE to run the DP code and a network of TEEs to maintain state continuity. Compared to an insecure baseline, we observe 1.1-3.2$\times$ overheads, with lower relative overheads for complex DP queries.
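
To make the budget-reset threat concrete, the following is a minimal sketch of why repeated answers to the same query defeat DP noise. It assumes a sensitivity-1 count query answered with the Laplace mechanism; the function names, the parameter values, and the choice of mechanism are illustrative assumptions rather than details taken from the paper. If an attacker can roll back the budget and obtain `n_repeats` independent noisy answers, averaging them shrinks the error roughly as one over the square root of `n_repeats`.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    # Laplace mechanism for a sensitivity-1 count query (assumed setting).
    return true_count + laplace_noise(1.0 / epsilon, rng)


def rollback_attack_rmse(true_count, epsilon, n_repeats, trials, rng):
    # Each trial models an attacker who resets the budget and repeats the
    # same query n_repeats times, then averages the noisy answers.
    squared_error = 0.0
    for _ in range(trials):
        guess = sum(
            noisy_count(true_count, epsilon, rng) for _ in range(n_repeats)
        ) / n_repeats
        squared_error += (guess - true_count) ** 2
    return math.sqrt(squared_error / trials)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 10, 100):
        rmse = rollback_attack_rmse(1000, 0.1, n, 2000, rng)
        print(f"repeats={n:4d}  RMSE={rmse:.2f}")
```

With a faithfully maintained budget, the attacker is limited to the number of answers the budget permits, so the averaging above is bounded.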

Paper Structure

This paper contains 47 sections, 4 theorems, 4 figures, 5 tables, 3 algorithms.

Key Result

lemma 1

Assuming the integrity and confidentiality properties of the TEE, the integrity properties of SCM and standard cryptographic assumptions for signatures, authenticated encryption and the hash function, ElephantDP has the following state integrity properties:

Figures (4)

  • Figure 1: The error (RMSE) of the attacker's guess for the count query answer under a rollback attack: the budget is reset when it is exceeded and each query is repeated $N_R$ times to average out the noise.
  • Figure 2: Overview of ElephantDP, where a TEE answers the analyst's queries using DP mechanisms on the data owner's behalf. During System Setup, the data owner uploads the encrypted dataset and the state containing the initial privacy budget to the server. It also uploads the hash of the encrypted state to a State Continuity Module (SCM) and the secret keys to the key storage. When the TEE is initialized (not depicted in the illustration), it loads the keys, the data and the state, and checks that the state is the most recent one by comparing it against the digest maintained by the SCM. When the analyst sends a query (Step 1), the TEE computes a DP answer and stores the query counter, the updated budget, the query and the answer in persistent storage (Step 2). It then tries to update the SCM with this new state (Step 3). If the update is successful (e.g., no other TEE has updated the state in the meantime), the TEE sends the answer to the analyst. The data owner and the components in orange are assumed to be trusted.
  • Figure 3: The overhead for a single query of each type, namely Average (Avg), Variance (Var), Correlation (Cor), GroupBy (Grp), Shuffle (Shf), and Private Multiplicative Weights (PMW), on the PUMS dataset using ElephantDP (E) and NaiveDP (N).
  • Figure 4: The time overhead (in ms) of answering GroupBy and Shuffle DP queries with NaiveDP (N) and ElephantDP (E) on the PUMS dataset. The time comprises dataset loading ($t_L$), answering a query ($t_Q$) and updating the state at the SCM by ElephantDP ($t_S$). (a) GroupBy query on the income grouped by age (same as in Table \ref{tab:output-a}). The number of groups takes values in {20, 50, 100} and corresponds to the query output size. (b) Shuffle DP query (same as in Table \ref{tab:output-b}) on the age column. The output size takes values in {120K, 600K, 1.2M}. All measurements are averaged over 100 runs.
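
The query flow described in the Figure 2 caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `StateContinuityModule`, `Enclave`, and `state_digest` are hypothetical names, the SCM is modeled as an in-memory object with a compare-and-swap on a SHA-256 digest, and encryption, signatures, key storage, and crash recovery are omitted. The point it shows is the ordering: the new state is persisted (Step 2) and accepted by the SCM (Step 3) before the answer is released.

```python
import hashlib
import json


def state_digest(state):
    # Digest of a canonical JSON encoding of the state (an assumed format).
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


class StateContinuityModule:
    """Hypothetical SCM stand-in that keeps only the latest state digest."""

    def __init__(self, initial_digest):
        self._digest = initial_digest

    def digest(self):
        return self._digest

    def update(self, old_digest, new_digest):
        # Compare-and-swap: accept only if the caller started from the
        # latest state, which rejects forked or rolled-back enclaves.
        if old_digest != self._digest:
            return False
        self._digest = new_digest
        return True


class Enclave:
    """Hypothetical TEE stand-in that answers queries and tracks the budget."""

    def __init__(self, storage, scm):
        self.storage = storage  # untrusted persistent storage (a dict here)
        self.scm = scm
        self.state = dict(storage["state"])
        # Freshness check at initialization against the SCM digest.
        if state_digest(self.state) != scm.digest():
            raise RuntimeError("stale or tampered state")

    def answer(self, query, epsilon, mechanism):
        if epsilon > self.state["budget"]:
            raise ValueError("privacy budget exceeded")
        old_digest = state_digest(self.state)
        result = mechanism(query, epsilon)
        new_state = {
            "counter": self.state["counter"] + 1,
            "budget": self.state["budget"] - epsilon,
            "query": query,
            "answer": result,
        }
        self.storage["state"] = dict(new_state)  # Step 2: persist the state
        # Step 3: the answer is withheld unless the SCM accepts the update.
        if not self.scm.update(old_digest, state_digest(new_state)):
            raise RuntimeError("SCM rejected the update")
        self.state = new_state
        return result
```

In this sketch, restoring an older copy of `storage["state"]` and starting a fresh `Enclave` fails the freshness check, and a second enclave forked from the same initial state has its update rejected by the SCM, so neither yields an answer beyond the budget.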

Theorems & Definitions (11)

  • definition 1
  • definition 2: Equivalent transcripts
  • definition 3: DP Confidentiality
  • lemma 1
  • proof
  • lemma 2
  • lemma 3
  • lemma 4
  • proof
  • proof
  • ...and 1 more