FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits

Sunrit Chakraborty, Saptarshi Roy, Debabrota Basu

TL;DR

It is shown that FLIPHAT achieves optimal regret in terms of privacy parameters, context dimension $d$, and time horizon $T$, up to a linear factor in model sparsity and a logarithmic factor in $d$.

Abstract

High dimensional sparse linear bandits serve as an efficient model for sequential decision-making problems (e.g. personalized medicine), where high dimensional features (e.g. genomic data) on the users are available, but only a small subset of them are relevant. Motivated by data privacy concerns in these applications, we study joint differentially private high dimensional sparse linear bandits, where both rewards and contexts are considered private data. First, to quantify the cost of privacy, we derive a lower bound on the regret achievable in this setting. To further address the problem, we design a computationally efficient bandit algorithm, ForgetfuL Iterative Private HArd Thresholding (FLIPHAT). Along with doubling of episodes and episodic forgetting, FLIPHAT deploys a variant of the Noisy Iterative Hard Thresholding (N-IHT) algorithm as a sparse linear regression oracle to ensure both privacy and regret-optimality. We show that FLIPHAT achieves optimal regret in terms of privacy parameters $\epsilon, \delta$, context dimension $d$, and time horizon $T$, up to a linear factor in model sparsity and a logarithmic factor in $d$. We analyze the regret by providing a novel refined analysis of the estimation error of N-IHT, which is of parallel interest.
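
To make the sparse regression oracle mentioned in the abstract more concrete, here is a minimal Python sketch of a generic noisy iterative hard thresholding loop: a gradient step on the least-squares loss, an additive noise perturbation, and a projection onto $s$-sparse vectors. It is not the paper's N-IHT variant. The function names, the step size, the Gaussian noise, and the `noise_scale` parameter are all assumptions made for illustration, and the noise here is not calibrated to any $(\epsilon, \delta)$ guarantee.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest.

    Assumes 0 <= s <= len(v).
    """
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out


def noisy_iht(X, y, s, n_iters=50, step=0.5, noise_scale=0.0, rng=None):
    """Illustrative noisy IHT for sparse least squares.

    noise_scale is a free parameter (an assumption of this sketch);
    it is NOT derived from a privacy budget, so no privacy is claimed.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        # Gradient of (1 / 2n) * ||y - X theta||^2 at the current iterate.
        grad = X.T @ (X @ theta - y) / n
        # Perturb the update, then project back onto s-sparse vectors.
        noise = noise_scale * rng.standard_normal(d)
        theta = hard_threshold(theta - step * grad + noise, s)
    return theta
```

With `noise_scale=0.0` the loop reduces to plain iterative hard thresholding. A private variant would additionally need bounded per-sample contributions and noise scaled to the privacy parameters, which this sketch deliberately leaves out.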

Paper Structure

This paper contains 47 sections, 9 theorems, 88 equations, 3 figures, 1 table, 3 algorithms.

Key Result

Theorem 3.2

If $\epsilon, \delta > 0$ and $\epsilon^{2} < \log(1/\delta)$, then for sufficiently large $s^* \log(d/s^*)$, the minimax regret for SLCBs under $(\epsilon,\delta)$-JDP [...]. Additionally, for $\delta=0$, we get $R^{\mathrm{minimax}}_{\epsilon}(T) = \Omega\left(\max\left\{ s^{*} \log^{3/2}(d/s^*)\, \epsilon^{-1},\ \sqrt{s^* T \log(d/s^*)} \right\}\right).$
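
To get a feel for the $\delta=0$ bound in Theorem 3.2, the short Python sketch below evaluates its two terms with the hidden constant dropped. The function names are hypothetical, and since $\Omega(\cdot)$ hides constants, the numbers only indicate orders of growth, not actual regret values. Setting the two terms equal gives a crossover horizon $T = s^* \log^2(d/s^*) / \epsilon^2$: below it the privacy term dominates, above it the non-private term does.

```python
import math


def pure_dp_lower_bound_terms(T, d, s_star, eps):
    """Constant-free terms of the delta = 0 lower bound in Theorem 3.2.

    Returns (privacy_term, nonprivate_term); the bound is their maximum,
    up to a constant that the Omega notation hides.
    """
    log_ratio = math.log(d / s_star)
    privacy_term = s_star * log_ratio ** 1.5 / eps
    nonprivate_term = math.sqrt(s_star * T * log_ratio)
    return privacy_term, nonprivate_term


def crossover_horizon(d, s_star, eps):
    """Horizon T at which the two terms are equal (ignoring constants)."""
    return s_star * math.log(d / s_star) ** 2 / eps ** 2


# Illustrative values only: d and s_star are chosen by assumption.
for eps in (1.0, 0.1, 0.01):
    priv, nonpriv = pure_dp_lower_bound_terms(T=10_000, d=400, s_star=5, eps=eps)
    print(f"eps={eps}: privacy={priv:.1f}, non-private={nonpriv:.1f}, "
          f"crossover T={crossover_horizon(400, 5, eps):.0f}")
```

In this toy setting the privacy term only dominates at $T = 10000$ for small $\epsilon$, which matches the intuition that the cost of privacy matters most at short horizons or under strict privacy.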

Figures (3)

  • Figure 1: (Left) Regret vs $T$ for different privacy levels $\epsilon$. (Right) Regret at $T=10000$ vs $\log(d)$ for different $\epsilon$.
  • Figure 2: Regret $R(T)$ versus time horizon $T$ in the setting $d=400, s^*=5, K=3$, with different privacy parameters $\epsilon$ and $\delta=0.01$. Contexts are drawn i.i.d. from a normal distribution with autoregressive covariance. (Left) Gaussian observation noise with mean 0 and $\sigma=0.1$. (Right) Bounded observation noise drawn from $\text{Uniform}(-0.1,0.1)$.
  • Figure 3: (Left) Effect of the choice of the tuning parameter $s$ on regret $R(T)$, with $T=10000$ and true $s^*=10$, for different privacy levels. (Right) Regret at $T=10000$ for different numbers of arms $K$, keeping $d, s^*$ and other parameters fixed.

Theorems & Definitions (14)

  • Definition 2.1: JDP in LCB
  • Definition 3.1: Minimax regret
  • Theorem 3.2: Lower Bound
  • Theorem 4.1
  • Theorem 5.1
  • Proposition 5.5: Estimation Control for Episode $\ell$
  • Theorem 5.6: Regret bounds for FLIPHAT
  • Definition A.1: dwork2006differential
  • Lemma A.2: dwork2021differentially
  • Definition B.1: Sparse Riesz Condition
  • ...and 4 more