Private Information Retrieval with Private Noisy Side Information

Hassan ZivariFard, Remi A. Chou

TL;DR

This work extends Private Information Retrieval (PIR) to a setting where the client holds private, noisy side information (SI) generated by $D$ test channels, and the file-to-channel mapping is unknown to the servers. Two privacy metrics are analyzed: (i) secrecy of both the desired file index and the entire mapping, and (ii) secrecy of the desired file index and the mapping, except that the index of the test channel associated with the desired file may be disclosed. The authors introduce a multilevel nested random binning (nested source coding) scheme that uses the noisy SI to remove redundancy and achieves the capacity, which is expressed in closed form for the undisclosed case and via an averaged form when channel-index disclosure is allowed. They prove that disclosing the channel index never increases, and typically reduces, the optimal download cost. The results unify several PIR variants under a common noisy-SI framework, recovering PIR with noiseless SI, PIR with storage-constrained SI, and PIR with colluding servers as special cases, and they provide concrete constructions (including nested polar coding in certain degraded-channel regimes) that attain the derived capacity.

Abstract

Consider Private Information Retrieval (PIR), where a client wants to retrieve one file out of $K$ files that are replicated in $N$ different servers, and the client's selection must remain private when up to $T$ servers may collude. Additionally, suppose that the client has noisy side information about each of the $K$ files, and the side information about a specific file is obtained by passing this file through one of $D$ possible discrete memoryless test channels, where $D\le K$. While the statistics of the test channels are known by the client and by all the servers, the specific mapping $\boldsymbol{\mathcal{M}}$ between the files and the test channels is unknown to the servers. We study this problem under two different privacy metrics. Under the first privacy metric, the client wants to preserve the privacy of its desired file selection and the mapping $\boldsymbol{\mathcal{M}}$. Under the second privacy metric, the client wants to preserve the privacy of its desired file and the mapping $\boldsymbol{\mathcal{M}}$ but is willing to reveal the index of the test channel that is associated with its desired file. For both privacy metrics, we derive the optimal normalized download cost. Our problem setup generalizes PIR with colluding servers, PIR with private noiseless side information, and PIR with private side information under storage constraints.

Paper Structure

This paper contains 29 sections, 12 theorems, 51 equations, and 5 figures.

Key Result

Theorem 1

Consider $K$ files that are replicated in $N$ servers, where up to $T$ of them may collude. Then, the optimal normalized download cost of PIR with private noisy side information and undisclosed side information statistics of the desired file is where $\Psi^{-1}(A,B)\triangleq\left(1+A+A^2+\dots+A^{B-1}\right)$, and for $i,j\in\mathbb{N}_*$, $d_{[i:j]}\triangleq\sum_{t=i}^jd_t$, when $i\le j$, and
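The two helper quantities defined in Theorem 1 can be sketched in a few lines of Python. This is only an illustration of the definitions of $\Psi^{-1}(A,B)$ and $d_{[i:j]}$ as stated above; the function names `psi_inv` and `d_range` are hypothetical, and the sketch does not reproduce the capacity expression itself. The case $i>j$ is not covered by the definition shown here, so the sketch rejects it rather than guess.

```python
from fractions import Fraction


def psi_inv(a, b):
    """Psi^{-1}(A, B) = 1 + A + A^2 + ... + A^(B-1), as defined in Theorem 1."""
    return sum(a ** k for k in range(b))


def d_range(d, i, j):
    """d_[i:j] = d_i + d_{i+1} + ... + d_j for i <= j, with 1-based indices.

    `d` is a Python list holding (d_1, ..., d_D).
    """
    if i > j:
        # The definition quoted above only covers i <= j.
        raise ValueError("d_[i:j] is only sketched here for i <= j")
    return sum(d[i - 1:j])


# Exact arithmetic with Fraction avoids floating-point rounding.
print(psi_inv(Fraction(1, 2), 3))   # 1 + 1/2 + 1/4 = 7/4
print(d_range([1, 1], 1, 2))        # d_1 + d_2 = 2
```

With $A=T/N$ and $B=K$, this geometric sum is the known optimal normalized download cost of PIR with $T$ colluding servers and no side information, one of the special cases the paper recovers.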

Figures (5)

  • Figure 1: PIR with private noisy side information and $T$-colluding servers, where the side information about a specific file is obtained by passing this file through one of $D$ possible discrete memoryless channels (DMCs) $\left(C^{(i)}\right)_{i\in[D]}$, where $D\le K$, i.e., for $j\in[K]$, there exists some $i\in[D]$ such that $Y_j^n$ is the output of channel $C^{(i)}$ when $X_j^n$ is the input. Here, $\left(X_i^n\right)_{i\in[K]}$ are the $K$ files that are replicated in $N$ servers, $\left(\mathbf{Q}_i\right)_{i\in[N]}$ are the queries for the servers, and $\left(\mathbf{A}_i\right)_{i\in[N]}$ are the corresponding answers of the servers. $Z$ is the index of the client's file selection and $X^n_Z$ is the file desired by the client.
  • Figure 2: Example with $(K,N,T,D)=(2,1,1,2)$ when the test channels are binary erasure channels (BECs).
  • Figure 3: Source codes of the files when the side information available at the client is according to the BEC with parameter $\epsilon_1$, i.e., $\mathbf{SC}_1$, and source codes of the files when the side information available at the client is according to the BEC with parameter $\epsilon_2-\epsilon_1$, i.e., $\mathbf{SC}_2$. Note that the source codes considered are nested.
  • Figure 4: Indexing the files based on the mapping $\boldsymbol{\mathcal{M}}$.
  • Figure 5: Dependency graph for all the involved random variables.
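To make the nested structure in Figures 2 and 3 concrete, the following is a rough sketch of the asymptotic rates involved. It assumes uniform binary files, so that side information observed through a BEC with erasure probability $\epsilon$ leaves a conditional entropy of $\epsilon$ bits per symbol, and it assumes $\epsilon_1\le\epsilon_2$. The function names are hypothetical, finite-length effects are ignored, and this is not the authors' construction, only an illustration of why the second layer has rate $\epsilon_2-\epsilon_1$.

```python
def bec_conditional_entropy(eps):
    """H(X|Y) in bits per symbol for a uniform binary X observed through BEC(eps)."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("erasure probability must lie in [0, 1]")
    return eps


def nested_layer_rates(erasure_probs):
    """Incremental rate of each nested source-code layer.

    Layers are ordered by increasing erasure probability: the first layer
    serves the least noisy side information, and each later layer adds only
    the extra rate needed for the next, noisier channel.
    """
    rates = []
    previous = 0.0
    for eps in sorted(erasure_probs):
        current = bec_conditional_entropy(eps)
        rates.append(current - previous)
        previous = current
    return rates


# Two BEC test channels, as in the (K, N, T, D) = (2, 1, 1, 2) example.
layers = nested_layer_rates([0.2, 0.5])
print(layers)       # roughly [0.2, 0.3]: the rates eps_1 and eps_2 - eps_1
print(sum(layers))  # roughly 0.5: total rate needed with the noisier channel
```

Under these assumptions, a client whose side information comes from the less noisy channel needs only the first layer, while a client with the noisier channel needs both layers.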

Theorems & Definitions (37)

  • Definition 1
  • Example 1: When $K=D=2$, $T=1$, and $d_1=d_2=1$
  • Definition 2: $\mathrm{C}_{\text{PIR-PNSI}}$ optimal normalized download cost
  • Definition 3: $\mathrm{C}_{\text{PIR-PNSI}}^*$ optimal normalized download cost
  • Remark 1: Rate definition
  • Example 2: PIR with colluding servers
  • Example 3: PIR with private noiseless side information
  • Example 4: PIR with private side information under storage constraints
  • Theorem 1
  • proof
  • ...and 27 more