Private Information Retrieval over Graphs
Gennian Ge, Hao Wang, Zixiang Xu, Yijun Zhang
TL;DR
This work studies private information retrieval (PIR) on graph-based 2-replication storage, focusing on how topology constrains capacity. It introduces a sharp information-theoretic framework to upper-bound PIR capacity on graphs and derives a novel bound for complete graphs: $\mathcal{C}(K_N) \le \frac{1}{\sum_{i=2}^{N} \frac{1}{i!}} \cdot \frac{1}{N}$, tightening prior results and approaching $\frac{1}{e-2} \cdot \frac{1}{N}$ asymptotically. It also proves an improved bound for balanced complete bipartite graphs and provides a constructive lower bound via a new deterministic PIR scheme over $K_N$, achieving $\mathcal{C}(K_N) \ge (\frac{4}{3}-o(1))/N$, which extends to general graphs. A key conceptual contribution is a bridge between deterministic and probabilistic PIR schemes, enabling subpacketization-1 implementations and yielding practical schemes on sparse graphs. The paper further develops a recursive, pattern-based construction for $K_N$, and a probabilistic transformation that preserves rate while reducing subpacketization, thereby narrowing the gap between upper and lower bounds and enabling scalable, topology-aware PIR designs.
Abstract
The problem of PIR in graph-based replication systems has received significant attention in recent years. A systematic study was conducted by Sadeh, Gu, and Tamo, where each file is replicated across two servers and the storage topology is modeled by a graph. The PIR capacity of a graph $G$, denoted by $\mathcal{C}(G)$, is defined as the supremum of retrieval rates achievable by schemes that preserve user privacy, with the rate measured as the ratio between the file size and the total number of bits downloaded. This paper makes the following key contributions. (1) The complete graph $K_N$ has emerged as a central benchmark in the study of PIR over graphs. The asymptotic gap between the upper and lower bounds for $\mathcal{C}(K_N)$ was previously 2 and was only recently reduced to $5/3$. We shrink this gap to $1.0444$, bringing it close to resolution. More precisely, (i) Sadeh, Gu, and Tamo proved that $\mathcal{C}(K_N)\le 2/(N+1)$ and conjectured this bound to be tight. We refute this conjecture by establishing the strictly stronger bound $\mathcal{C}(K_N) \le \frac{1.3922}{N}.$ We also improve the upper bound for the balanced complete bipartite graph $\mathcal{C}(K_{N/2,N/2})$. (ii) The first lower bound on $\mathcal{C}(K_N)$ was $(1+o(1))/N$, which was recently sharpened to $(6/5+o(1))/N$. We provide explicit, systematic constructions that further improve this bound, proving $\mathcal{C}(K_N)\ge(4/3-o(1))/N,$ which in particular implies $\mathcal{C}(G) \ge (4/3-o(1))/|G|$ for every graph $G$. (2) We establish a conceptual bridge between deterministic and probabilistic PIR schemes on graphs. This connection has significant implications for reducing the required subpacketization in practical implementations and is of independent interest. We also design a general probabilistic PIR scheme that performs particularly well on sparse graphs.
