Locally Private Estimation with Public Features

Yuheng Ma; Ke Jia; Hanfang Yang

TL;DR

This work formalizes locally private learning with public features via semi-feature LDP, under which the mini-max convergence rate for non-parametric regression is significantly reduced compared to that of classical LDP. It introduces HistOfTree, a partition-based estimator that couples a private-feature histogram with a public-feature decision tree, and proves that it attains the mini-max rate in the aligned setting. For personalized privacy, where users choose which features to protect, it provides an adaptive estimator with a data-driven tuning strategy. Theoretical contributions include a semi-feature LDP lower bound and an upper bound on the estimator’s excess risk, plus guidance on choosing the number of private features $s$. Empirically, HistOfTree and its adaptive variant outperform naïve baselines on both synthetic and real datasets across privacy budgets.

Abstract

We initiate the study of locally differentially private (LDP) learning with public features. We define semi-feature LDP, where some features are publicly available while the remaining ones, along with the label, require protection under local differential privacy. Under semi-feature LDP, we demonstrate that the mini-max convergence rate for non-parametric regression is significantly reduced compared to that of classical LDP. Then we propose HistOfTree, an estimator that fully leverages the information contained in both public and private features. Theoretically, HistOfTree reaches the mini-max optimal convergence rate. Empirically, HistOfTree achieves superior performance on both synthetic and real data. We also explore scenarios where users have the flexibility to select features for protection manually. In such cases, we propose an estimator and a data-driven parameter tuning strategy, leading to analogous theoretical and empirical results.

Paper Structure (38 sections, 15 theorems, 91 equations, 2 figures, 3 tables)

Introduction
Related Work
Public Data and Features
Personalized Privacy
Methodology
Preliminaries
Notations
Semi-feature LDP
HistOfTree Estimator for Aligned Private Features
Privacy Mechanism
Partition
Personalized Private Features
Privacy Mechanism
Potential Privacy Risk from $W$
Theoretical Results
...and 23 more sections

Key Result

Proposition 3.2

Let $\pi = \{ A\times B\mid A\in\pi^{\text{priv}}, B\in \pi^{\text{pub}}\}$ be any partition of $\mathcal{X}$. Then the privacy mechanism \eqref{equ:privavyprocedurepersonalize} is non-interactively $\varepsilon$-semi-feature LDP.
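The product structure of $\pi$ can be illustrated with a minimal sketch. This is not the paper's mechanism, whose exact form is not reproduced on this page. It assumes private features scaled to $[0,1)$, a bounded label, an equal-width histogram for $\pi^{\text{priv}}$, a precomputed public cell index standing in for the decision tree on $\pi^{\text{pub}}$, and a standard Laplace mechanism with an even budget split. The names `privatize_user`, `aggregate`, `n_bins`, and `y_bound` are hypothetical.

```python
import numpy as np

def privatize_user(x_priv, y, n_bins, epsilon, rng, y_bound=1.0):
    """One user's non-interactive report (assumed Laplace mechanism).

    x_priv: private features in [0, 1)^s; y: label with |y| <= y_bound.
    """
    s = len(x_priv)
    # Locate the histogram cell A of the private features (equal-width grid).
    idx = np.minimum((np.asarray(x_priv) * n_bins).astype(int), n_bins - 1)
    cell = int(np.ravel_multi_index(tuple(idx), (n_bins,) * s))
    n_cells = n_bins ** s
    onehot = np.zeros(n_cells)
    onehot[cell] = 1.0
    # Budget split epsilon/2 + epsilon/2 (an assumption of this sketch).
    # Two one-hot vectors differ by at most 2 in L1, so the scale is 4/epsilon.
    u = onehot + rng.laplace(scale=4.0 / epsilon, size=n_cells)
    # y * onehot changes by at most 2 * y_bound in L1 between two users.
    v = y * onehot + rng.laplace(scale=4.0 * y_bound / epsilon, size=n_cells)
    return u, v

def aggregate(reports, public_cells, n_pub_cells, min_count=0.5):
    """Ratio estimate on each product cell A x B.

    public_cells[i] is the index of user i's public cell B, computed from
    public features in the clear, so it needs no privatisation.
    """
    n_priv_cells = reports[0][0].size
    U = np.zeros((n_pub_cells, n_priv_cells))  # noisy counts
    V = np.zeros((n_pub_cells, n_priv_cells))  # noisy label sums
    for (u, v), b in zip(reports, public_cells):
        U[b] += u
        V[b] += v
    # Guard against tiny or negative noisy counts; such cells predict 0.
    safe = U > min_count
    return np.where(safe, V / np.where(safe, U, 1.0), 0.0)

rng = np.random.default_rng(0)
reports = [privatize_user([0.1], 1.0, n_bins=2, epsilon=2.0, rng=rng),
           privatize_user([0.9], -1.0, n_bins=2, epsilon=2.0, rng=rng)]
estimate = aggregate(reports, public_cells=[0, 1], n_pub_cells=2)
```

Only the private cell indicator and the label pass through noise in this sketch. The public cell index is used as-is, which is what lets the estimator refine the partition along public features without spending privacy budget.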

Figures (2)

Figure 1: Illustration of different $W$, where blue means $W_i^j=1$. In the aligned case (a), all users protect the first two features. In the personalized case, users specify different features, with the protected features being concentrated in (b) and spread in (c). The yellow boundaries represent the $s$ selected private features.
Figure 2: Experimental results on synthetic data. LabelDT and Hist are labeled as label LDP and LDP, respectively. HistOfTree is labeled with the specific choice of parameters $s$ and $t$. In \ref{fig:privacyutility}, we apply uneven scaling to the x-axis to accommodate the outlying value of 1024, which represents the non-private performance. In \ref{fig:selects}, AdHistOfTree is labeled as adaptive.

Theorems & Definitions (30)

Definition 3.1: Semi-feature local differential privacy
Proposition 3.2
Theorem 4.2
Theorem 4.3
Corollary 4.4
Proposition A.1
Proof of Proposition \ref{prop:privacy}
Proof of Proposition \ref{prop:privacygeneralized}
Proof of Theorem \ref{thm:lowerbound}
Lemma B.1: Bounding privatised error
...and 20 more
