Classification Under Local Differential Privacy with Model Reversal and Model Averaging

Caihong Qin, Yang Bai

TL;DR

This work tackles the classification task under Local Differential Privacy by reframing private learning as transfer learning, where noised data serve as the source and clean targets are unavailable. It introduces a privacy-aware utility evaluation and two core techniques—Model Reversal (MR) and Model Averaging (MA)—to salvage and combine weak, privacy-perturbed classifiers. The authors provide excess-risk bounds demonstrating the benefits of MRMA and extend the framework to functional data via basis projections, with extensive experiments on simulated and real datasets showing substantial accuracy gains under varying privacy levels. The approach broadens the applicability of private learning, supports single- and multi-server settings, and offers practical pathways for privacy-preserving classification in high-dimensional and functional-data contexts.

Abstract

Local differential privacy (LDP) has become a central topic in data privacy research, offering strong privacy guarantees by perturbing user data at the source and removing the need for a trusted curator. However, the noise introduced by LDP often significantly reduces data utility. To address this issue, we reinterpret private learning under LDP as a transfer learning problem, where the noisy data serve as the source domain and the unobserved clean data as the target. We propose novel techniques specifically designed for LDP to improve classification performance without compromising privacy: (1) a noised binary feedback-based evaluation mechanism for estimating dataset utility; (2) model reversal, which salvages underperforming classifiers by inverting their decision boundaries; and (3) model averaging, which assigns weights to multiple reversed classifiers based on their estimated utility. We provide theoretical excess risk bounds under LDP and demonstrate how our methods reduce this risk. Empirical results on both simulated and real-world datasets show substantial improvements in classification accuracy.
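The three ingredients listed in the abstract can be illustrated with a small sketch for binary labels in {0, 1}. This is not the authors' algorithm: it assumes the noised binary feedback is a "was the prediction correct?" bit released through randomized response, and it uses a hypothetical weighting rule (margin above chance) for the averaging step. All function names are illustrative.

```python
import numpy as np

def randomized_response(bits, epsilon, rng):
    """Keep each private bit with probability e^eps / (1 + e^eps), else flip it."""
    p_keep = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    keep = rng.random(bits.shape) < p_keep
    return np.where(keep, bits, 1 - bits)

def estimate_accuracy(noisy_bits, epsilon):
    """Debias the mean of randomized-response reports of 'prediction was correct'."""
    p_keep = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    # E[noisy] = (1 - p_keep) + acc * (2 * p_keep - 1), solved for acc.
    est = (noisy_bits.mean() - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)
    return float(np.clip(est, 0.0, 1.0))

def reverse_if_needed(preds, est_acc):
    """Model reversal: flip a binary classifier estimated to be worse than chance."""
    if est_acc < 0.5:
        return 1 - preds, 1.0 - est_acc
    return preds, est_acc

def average_models(pred_list, acc_list):
    """Model averaging as a weighted vote.

    Assumption: weights are the estimated margin above chance; the paper's
    actual weighting scheme may differ.
    """
    weights = np.maximum(np.asarray(acc_list, dtype=float) - 0.5, 0.0)
    if weights.sum() == 0.0:
        weights = np.ones_like(weights)  # fall back to a plain majority vote
    weights = weights / weights.sum()
    scores = np.tensordot(weights, np.asarray(pred_list, dtype=float), axes=1)
    return (scores >= 0.5).astype(int)
```

In this sketch a classifier with estimated accuracy 0.2 becomes, after reversal, one with estimated accuracy 0.8, which is why a badly perturbed model can still contribute to the weighted vote.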

Paper Structure

This paper contains 27 sections, 6 theorems, 49 equations, 13 figures, 5 tables, 1 algorithm.

Key Result

Proposition 2

With the mapping functions $\eta$ and $\eta^{(\boldsymbol{\varepsilon})}$ defined above, we have where and

Figures (13)

  • Figure 1: Heatmaps of the weight function $\omega(z \mid z_0, \varepsilon_z)$ as a function of $z$ for various values of $z_0$ and $\varepsilon_z$, where we assume $z \sim U(-1,1)$ and $z_0 = z + \delta_z^{(\varepsilon_z)}$ with $\delta_z^{(\varepsilon_z)}$ drawn from a Laplace distribution having mean zero and scale $2/\varepsilon_z$. As $\varepsilon_z$ decreases, the ranges of the color bars become increasingly concentrated around 1.
  • Figure 2: The misclassification rates of classifiers with a single server under $\varepsilon$-LDP.
  • Figure 3: The misclassification rates of classifiers with multiple servers under $\varepsilon$-LDP.
  • Figure 4: The misclassification rates of classifiers with a single server under $\varepsilon$-LDP on the physical activity dataset.
  • Figure 5: The misclassification rates of classifiers with a single server under $\varepsilon$-LDP on the Phonemes dataset.
  • ...and 8 more figures
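The perturbation described in the Figure 1 caption, $z_0 = z + \delta_z^{(\varepsilon_z)}$ with Laplace noise of scale $2/\varepsilon_z$, can be simulated in a few lines. The sketch below only reproduces that sampling setup; it does not compute the weight function $\omega$, whose definition is not given here. The function name `laplace_perturb` is illustrative.

```python
import numpy as np

def laplace_perturb(z, epsilon, rng):
    """Add zero-mean Laplace noise with scale 2 / epsilon to values in [-1, 1]."""
    scale = 2.0 / epsilon  # the interval [-1, 1] has width 2
    return z + rng.laplace(loc=0.0, scale=scale, size=np.shape(z))

rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=100_000)  # z ~ U(-1, 1), as in the caption

for eps in (0.5, 2.0, 8.0):
    z0 = laplace_perturb(z, eps, rng)
    # The noise standard deviation is sqrt(2) * 2 / eps, so it grows as eps shrinks.
    print(f"eps={eps}: empirical noise std = {np.std(z0 - z):.3f}")
```

Smaller $\varepsilon_z$ gives noisier $z_0$, so the observed value carries less information about $z$. This is consistent with the caption's remark that the weights concentrate around 1 as $\varepsilon_z$ decreases.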

Theorems & Definitions (7)

  • Definition 1: $\varepsilon$-LDP (Kasiviswanathan et al., 2011)
  • Proposition 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
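
For reference, $\varepsilon$-LDP is usually stated in the following standard form; the paper's exact wording of Definition 1 may differ. Here $M$ denotes a randomized mechanism applied by each user, $\mathcal{X}$ the input domain, and $S$ any measurable set of outputs (these symbols are assumptions introduced for illustration):

$$
\Pr\bigl[M(x) \in S\bigr] \;\le\; e^{\varepsilon}\, \Pr\bigl[M(x') \in S\bigr]
\qquad \text{for all } x, x' \in \mathcal{X} \text{ and all measurable } S .
$$

For instance, adding Laplace noise with scale $2/\varepsilon$ to a value in $[-1, 1]$, as in the Figure 1 caption, satisfies this condition: the two output densities differ by a factor of at most $\exp\bigl(\varepsilon\,|x - x'|/2\bigr) \le e^{\varepsilon}$, since $|x - x'| \le 2$.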