Distributed Differentially Private Data Analytics via Secure Sketching

Jakob Burkhardt, Hannah Keller, Claudio Orlandi, Chris Schwiegelshohn

TL;DR

This work introduces the linear-transformation model (LTM) for distributed differential privacy, in which secure linear sketches are computed across multiple servers via MPC while limiting per-client noise. By leveraging oblivious subspace embeddings and Johnson-Lindenstrauss (JL) transforms, the authors provide DP mechanisms for private low-rank approximation and ridge regression that approach central-DP utility with far less noise than the local model requires. They formalize privacy guarantees under a multi-central model, develop dense and sparse JL-based sketching mechanisms, and connect the approach to cryptographic assumptions and the shuffle model. Empirical results, including MPC-based running-time experiments and experiments on real-world data, indicate that LTM interpolates between local and central DP as the number of clients grows. Overall, LTM offers a middle ground for distributed DP: it trades expressiveness for efficiency while preserving privacy and utility in linear-algebraic tasks.

Abstract

We introduce the linear-transformation model, a distributed model of differentially private data analysis. Clients have access to a trusted platform capable of applying a public matrix to their inputs. Such computations can be securely distributed across multiple servers using simple and efficient secure multiparty computation techniques. The linear-transformation model serves as an intermediate model between the highly expressive central model and the minimal local model. In the central model, clients have access to a trusted platform capable of applying any function to their inputs. However, this expressiveness comes at a cost, as it is often prohibitively expensive to distribute such computations, leading to the central model typically being implemented by a single trusted server. In contrast, the local model assumes no trusted platform, which forces clients to add significant noise to their data. The linear-transformation model avoids the single point of failure for privacy present in the central model, while also mitigating the high noise required in the local model. We demonstrate that linear transformations are very useful for differential privacy, allowing for the computation of linear sketches of input data. These sketches largely preserve utility for tasks such as private low-rank approximation and private ridge regression, while introducing only minimal error, critically independent of the number of clients.
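The claim that a public matrix can be applied securely across several servers rests on linearity: if each client additively secret-shares its input, every server can apply the matrix to its own shares, and the per-server outputs sum to the matrix applied to the true inputs. The following is a minimal sketch of that idea only, not the authors' protocol. The modulus `P`, the function names, and the choice of three servers are illustrative assumptions, and the client-side noise that differential privacy requires is omitted.

```python
import random

# Assumption: arithmetic over a prime field; this modulus is illustrative only.
P = 2**61 - 1


def share(value, num_servers, rng):
    """Split value into additive shares that sum to value modulo P."""
    shares = [rng.randrange(P) for _ in range(num_servers - 1)]
    shares.append((value - sum(shares)) % P)
    return shares


def apply_public_matrix(matrix, vector):
    """Multiply a public matrix by a vector of shares, modulo P."""
    return [sum(m * v for m, v in zip(row, vector)) % P for row in matrix]


def secure_linear_sketch(matrix, client_inputs, num_servers=3, seed=0):
    """Each client shares one scalar; each server sketches its shares locally."""
    rng = random.Random(seed)  # demo only; real shares need a cryptographic RNG
    per_client = [share(x % P, num_servers, rng) for x in client_inputs]
    # Server j sees only the j-th share of every client.
    server_views = [[s[j] for s in per_client] for j in range(num_servers)]
    partial = [apply_public_matrix(matrix, view) for view in server_views]
    # Adding the per-server outputs reconstructs matrix @ inputs (mod P).
    return [sum(column) % P for column in zip(*partial)]
```

No single server's view reveals a client's input, because any strict subset of the shares is uniformly random; only the combined output of all servers is meaningful.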

Paper Structure

This paper contains 17 sections, 15 theorems, 18 equations, 4 figures, 8 tables, 1 algorithm.

Key Result

Lemma 3.3

Let $f:\mathcal{X} \to \mathbb{R}^k$ be a function and let $\varepsilon \geq 0$ and $\delta \in [0,1]$. The Gaussian mechanism adds noise sampled from $N(0,\sigma^2)$ to each of the $k$ components of the output, with $\sigma^2 = \frac{2 (\Delta_2 f)^2 \ln(1.25/\delta)}{\varepsilon^2}$, where $\Delta_2 f = \max_{x \sim x'} \lVert f(x) - f(x') \rVert_2$ denotes the $\ell_2$ sensitivity of the function $f$. The Gaussian mechanism is $(\varepsilon,\delta)$-differentially private.
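As a rough illustration of the lemma, the sketch below calibrates Gaussian noise to the $\ell_2$ sensitivity. It assumes the classical calibration $\sigma = \sqrt{2\ln(1.25/\delta)}\,\Delta_2 f/\varepsilon$ from the standard analysis of the Gaussian mechanism, which is stated for $0 < \varepsilon < 1$ and $0 < \delta < 1$; the function names are hypothetical and the code is not taken from the paper.

```python
import math
import random


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale under the classical calibration (assumed, see lead-in)."""
    if not 0 < epsilon < 1:
        raise ValueError("the classical analysis assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(2 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def gaussian_mechanism(values, l2_sensitivity, epsilon, delta, rng=None):
    """Add independent N(0, sigma^2) noise to each of the k output components."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]
```

For example, `gaussian_mechanism([3.0, 1.5], l2_sensitivity=1.0, epsilon=0.5, delta=1e-5)` returns a noisy copy of the two-component output. Python's `random` module is used only for illustration; a deployment would need a carefully implemented noise sampler.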

Figures (4)

  • Figure 1: Adversary's View
  • Figure 2: Plots depicting the asymptotic behavior of the error $\psi$ for $\varepsilon \in \{ 0.1,0.5 \}$ (top, bottom) and $k \in \{ 5,10 \}$ (left, right), with $d=50$. The gray line depicts the error of the local mechanism, and the orange and lime lines depict our approach using Gaussian and Laplacian noise, respectively. The other lines correspond to different values of $p$ when using Gaussian noise. Standard deviations are shown as vertical black lines, and the $x$-axis is logarithmic in the number of clients $n$.
  • Figure 3: Plots depicting the asymptotic behavior of the error $\phi$ for $\varepsilon \in \{ 0.01,0.03,0.05,0.1 \}$ (top left, top right, bottom left, bottom right), with $d=10$, $\lambda = 10$ and $\mu^2 = n$. The gray line depicts the error of the local mechanism, the blue line depicts the error in the central model, and the orange line depicts our approach. The other lines correspond to different values of $p$. In most cases the standard deviations are too small to be visible.
  • Figure 4: Plots depicting the error $\phi$ as $n$ increases, for $\varepsilon \in \{ 0.01,0.03,0.05,0.1 \}$ (top to bottom) and $\lambda \in \{ 1,10,100 \}$ (left to right). The middle row depicts the same choice of parameters as Figure 3.

Theorems & Definitions (19)

  • Definition 3.1: OSNAP [nelson2013osnap]
  • Definition 3.2: Differential Privacy in the central model [DMNS06]
  • Lemma 3.3: The Gaussian Mechanism [dwork2014]
  • Lemma 3.4: The Laplace Mechanism [DMNS06]
  • Lemma 3.5: Sequential Composition [DMNS06]
  • Definition 4.1: Trusted Computation Model for Differential Privacy
  • Definition 4.2: Instantiation of Trusted Computation Model for Differential Privacy with MPC
  • Lemma 4.3
  • Theorem 4.4
  • Corollary 4.7
  • ...and 9 more
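
The statements of Lemma 3.4 and Lemma 3.5 are not reproduced in this listing. As a hedged illustration, the sketch below uses the textbook forms of both results: the Laplace mechanism adds noise of scale $\Delta_1 f/\varepsilon$ to each output component, and sequential composition adds up the $\varepsilon$ and $\delta$ parameters of the mechanisms that are run. Function names are hypothetical, and the paper's exact formulations may differ.

```python
import random


def laplace_scale(l1_sensitivity, epsilon):
    """Scale b = sensitivity / epsilon of the textbook Laplace mechanism."""
    if epsilon <= 0 or l1_sensitivity <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return l1_sensitivity / epsilon


def laplace_mechanism(values, l1_sensitivity, epsilon, rng=None):
    """Add independent Laplace(0, b) noise to each output component."""
    rng = rng or random.Random()
    rate = 1.0 / laplace_scale(l1_sensitivity, epsilon)
    # The difference of two i.i.d. exponentials with this rate is Laplace(0, b).
    return [v + rng.expovariate(rate) - rng.expovariate(rate) for v in values]


def compose_sequentially(budgets):
    """Basic sequential composition: epsilons and deltas add up."""
    total_epsilon = sum(eps for eps, _ in budgets)
    total_delta = sum(delta for _, delta in budgets)
    return total_epsilon, total_delta
```

For instance, running a $0.5$-DP Laplace release followed by a $(0.25, 10^{-6})$-DP Gaussian release on the same data costs `compose_sequentially([(0.5, 0.0), (0.25, 1e-6)])` in total under basic composition.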