Optimal Piecewise-based Mechanism for Collecting Bounded Numerical Data under Local Differential Privacy

Ye Zheng, Sumita Mishra, Yidan Hu

TL;DR

This work tackles the problem of maximizing data utility when collecting bounded numerical data under local differential privacy. It generalizes the existing three-piece TPM into an m-piece generalized piecewise mechanism (GPM) and derives a framework to obtain closed-form optimal instantiations, including a classical-domain and a circular-domain solution. By combining analytical derivations with off-the-shelf optimization, the authors show that their generalized, piecewise-based mechanisms achieve optimal utility among all generalized piecewise mechanisms and prove improvements for distribution and mean estimation. They further extend the approach to circular data, leveraging a loss transformation to achieve near-optimal performance with explicit closed forms. Empirical evaluations on synthetic benchmarks and real data demonstrate substantial utility gains over existing TPM-based mechanisms and bounded-Laplace variants, with practical applicability to sensor networks and federated learning.

Abstract

Numerical data with bounded domains is a common data type on personal devices, such as wearable sensors. While collecting such data is essential for third-party platforms, it raises significant privacy concerns. Local differential privacy (LDP) is a framework that provides provable individual privacy, even when the third-party platform is untrusted. For numerical data with bounded domains, the existing state-of-the-art LDP mechanisms are piecewise-based mechanisms, which are not optimal and therefore reduce data utility. This paper investigates the optimal design of piecewise-based mechanisms to maximize data utility under LDP. We demonstrate that existing piecewise-based mechanisms are heuristic forms of the $3$-piecewise mechanism, which is far from sufficient for studying optimality. We generalize the $3$-piecewise mechanism to its most general form, i.e., the $m$-piecewise mechanism with no predefined form for each piece. Under this form, we derive the closed-form optimal mechanism by combining analytical proofs and off-the-shelf optimization solvers. Next, we extend the generalized piecewise-based mechanism from the classical domain to the circular domain, which is defined on a cyclic range where the distance between the two endpoints is zero. By incorporating this property, we design the optimal mechanism for the circular domain, achieving significantly improved data utility compared with existing mechanisms. Our proposed mechanisms guarantee optimal data utility under LDP among all generalized piecewise-based mechanisms. We show that they also achieve optimal data utility in two common applications of LDP: distribution estimation and mean estimation. Theoretical analyses prove, and experimental evaluations validate, the data utility advantages of our proposed mechanisms.
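To make the idea of a $3$-piecewise mechanism concrete, the following is a minimal sketch of a generic three-piece mechanism on the domain $[0, 1)$. It is not the paper's optimal closed form: the function names, the choice of $[0, 1)$ as both input and output domain, and the free parameter `width` (the length of the high-probability piece) are assumptions made for illustration only. The sketch relies on the standard $\varepsilon$-LDP requirement that the output densities for any two inputs differ by at most a factor of $e^{\varepsilon}$, which holds here because the density takes only two values whose ratio is $e^{\varepsilon}$.

```python
import math
import random


def three_piece_params(x, eps, width):
    """Return (l, r, p_high, p_low) for an input x in [0, 1).

    The output density is p_high on [l, r) and p_low elsewhere in [0, 1).
    `width` is a hypothetical free parameter, not the paper's optimal value.
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    if not 0.0 < width < 1.0:
        raise ValueError("width must lie in (0, 1)")
    # Centre the high-probability piece on x, shifted to stay inside [0, 1).
    l = min(max(x - width / 2.0, 0.0), 1.0 - width)
    r = l + width
    # Normalisation: p_high * width + p_low * (1 - width) = 1,
    # with p_high = e^eps * p_low so the density ratio never exceeds e^eps.
    p_low = 1.0 / (width * math.exp(eps) + 1.0 - width)
    p_high = math.exp(eps) * p_low
    return l, r, p_high, p_low


def density(y, x, eps, width):
    """Output density at y when the true input is x."""
    l, r, p_high, p_low = three_piece_params(x, eps, width)
    return p_high if l <= y < r else p_low


def perturb(x, eps, width, rng=random):
    """Sample one perturbed report for the input x."""
    l, r, p_high, _ = three_piece_params(x, eps, width)
    if rng.random() < p_high * width:
        # High-probability piece: uniform on [l, r).
        return l + width * rng.random()
    # Low-probability pieces: uniform on [0, l) ∪ [r, 1), total length 1 - width.
    u = (1.0 - width) * rng.random()
    return u if u < l else u + width
```

A call such as `perturb(0.3, eps=1.0, width=0.25)` returns one randomized report. The paper's contribution, by contrast, is to choose the number of pieces, their endpoints, and their probabilities optimally rather than fixing them heuristically as this sketch does.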

Paper Structure

This paper contains 47 sections, 8 theorems, 44 equations, 21 figures, 3 tables.

Key Result

lemma 1

Assuming $\mathcal{D} = [a,b)$, the objective of Formulation (\ref{equ:min-max}) can be simplified to

Figures (21)

  • Figure 1: Solving flow for the optimal $m$-GPM. Two arrows indicate problems in (\ref{equ:min-max}) and (\ref{equ:lr_i}).
  • Figure 2: Optimal $4$-GPM and $5$-GPM when $\varepsilon = 1$ and $x = 0.3$. They are identical to the $m = 3$ case after merging redundant pieces.
  • Figure 3: Solving flow for the optimal closed form.
  • Figure 4: Optimal GPM (Theorem \ref{theo:optimal_concretization}) when $\varepsilon = 1$, $x = 0$ and $x=0.5$.
  • Figure 5: Reduced forms of solving the optimal $p_i$, $l_{i,x}^{\text{mod}}$ and $r_{i,x}^{\text{mod}}$. Optimizations under circular distance $\mathcal{L}_{\text{mod}}$ can be reduced to those under linear distance $\mathcal{L}$.
  • ...and 16 more figures

Theorems & Definitions (12)

  • definition 1: $\varepsilon$-LDP \cite{DBLP:journals/corr/DuchiWJ16}
  • definition 2
  • definition 3
  • lemma 1
  • lemma 2
  • theorem 1: Transformation invariants
  • theorem 2
  • lemma 3
  • theorem 3
  • theorem 4
  • ...and 2 more