Learning Robust and Privacy-Preserving Representations via Information Theory

Binghui Zhang, Sayedeh Leila Noorbakhsh, Yun Dong, Yuan Hong, Binghui Wang

TL;DR

This work proposes an information-theoretic framework that mitigates both security and privacy attacks while maintaining task utility. It does so through representation learning, i.e., by learning representations that are robust to both adversarial examples and attribute inference adversaries.

Abstract

Machine learning models are vulnerable to both security attacks (e.g., adversarial examples) and privacy attacks (e.g., private attribute inference). We take the first step to mitigate both security and privacy attacks while maintaining task utility. In particular, we propose an information-theoretic framework that pursues these goals through the lens of representation learning, i.e., learning representations that are robust to both adversarial examples and attribute inference adversaries. We also derive novel theoretical results under our framework, e.g., the inherent trade-off between adversarial robustness/utility and attribute privacy, and a guarantee on attribute privacy leakage against attribute inference adversaries.

Paper Structure

This paper contains 26 sections, 15 theorems, 46 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Consider all primary task classifiers as $\mathcal{C} = \{ C: \mathcal{Z} \rightarrow \mathcal{Y} \}$. Given the perturbation budget $\epsilon$, for any representation learner $f: \mathcal{X} \rightarrow \mathcal{Z}$,

Figures (3)

  • Figure 1: Overview of ARPRL.
  • Figure 2: 2D representations learnt by ARPRL. (a) Raw data; (b) only robust representations (privacy acc: 99%, robust acc: 88%, test acc: 99%); and (c) robust + privacy-preserving representations (privacy acc: 55%, robust acc: 75%, test acc: 85%). Red vs. blue: binary private attribute values; cross $\times$ vs. circle $\circ$: binary task labels.
  • Figure 3: 2D t-SNE representations learnt by AdvPPRL. Left: only robust representations; Right: robust + privacy-preserving representations (under the best tradeoff in Table \ref{tab:allresults}). Colors indicate attribute values, while point patterns indicate labels.

Theorems & Definitions (23)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 5
  • proof
  • Definition 1: Lipschitz function and Lipschitz norm
  • Definition 2: Total variation (TV) distance
  • Definition 3: 1-Wasserstein distance
  • ...and 13 more
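
For reference, the three definitions named in the list above have standard textbook forms, sketched below. The symbols used here ($d$ for a metric on the input space, $\mu$ and $\nu$ for probability measures, $\|f\|_L$ for the Lipschitz norm) are assumptions for illustration and may differ from the paper's own notation.

```latex
\begin{align*}
% Lipschitz norm: the smallest constant L such that f is L-Lipschitz under d
\|f\|_{L} &= \sup_{x \neq x'} \frac{|f(x) - f(x')|}{d(x, x')} \\
% Total variation distance: largest disagreement over measurable events A
d_{TV}(\mu, \nu) &= \sup_{A} \, |\mu(A) - \nu(A)| \\
% 1-Wasserstein distance (Kantorovich-Rubinstein dual form)
W_1(\mu, \nu) &= \sup_{\|f\|_{L} \le 1} \left| \int f \, d\mu - \int f \, d\nu \right|
\end{align*}
```

As a quick check of the second line, for two distributions on $\{0, 1\}$ with $\mu = (0.5, 0.5)$ and $\nu = (1, 0)$, the event $A = \{0\}$ attains the supremum, giving $d_{TV}(\mu, \nu) = 0.5$.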